Type inference for private fields

In this post I will describe one of the hurdles I had with Rust recently. I’ll explain one of my bad habits of using trait objects instead of concrete stack allocated types due to Rust’s inability to use the type inferrer for private fields within structs.

At the time of writing this post I am in the middle of developing a Rust program using #![no_std] which means there are no Vec or Box types available. Nonetheless, there is the possibility to implement a custom allocator which then allows the use of Box and Vec as well. That’s what I did in my project. Within my project, I have to iterate over various data types most of the time which just ask for various Iterator implementations to abstract away some of the complexities arising from some data structures like the following:

let data: [u32,u32,u32,u32,u32,Redirect<u32>, Redirect<Redirect<u32>>, Redirect<Redirect<Redirect<u32>>>] = ...;

This array describes locations expressed by the type u32. Redirect<T> points to another location where a fixed size of Ts are located which also describe locations. Additionally the data is meant to be read serially from start to end and the first occurrence of 0 should terminate the reading process. This sounds like a perfect match for an Iterator. Therefore I divided the array into 4 parts:

  • [u32; fixed_size]
  • Redirect<u32>
  • Redirect<Redirect<u32>>
  • Redirect<Redirect<Redirect<u32>>>

where each part has a concrete iterator which conforms to Iterator<Item=u32>. For the final iterator I am chaining together the separate iterator instances and add a TakeWhile iterator to the end to enforce the 0-termination. I created a struct Foo which then stores the iterator internally and yields the next element from inner for its own Iterator implementation.

pub struct Foo {
    inner: Box<Iterator<Item = u32>>
}

Subconsciously I stored the Iterator in a Box but I realized that this is not the only option for at least the following reasons:

  • I do not depend on any external iterators passed by reference.
  • I know how the iterator is composed of (including the concrete types).
  • I am not going to expose the iterator to the users of Foo.

So why did I use a boxed Iterator? I simply didn’t want to explicitly write the type. I’m just too lazy. Therefore I thought: “Let’s store the iterator on the stack!”. As you may expect, this quickly escalated into a 5h refactoring session simply to express the type of a more complicated iterator.

Before I started refactoring the code, I asked a fellow Rust developer whether he knows of a way on how to abbreviate the type declaration for a complicated iterator and he wrote me this:

Q: Why does this line have 860 characters? Don’t you know that a single line should not exceed…
A: That’s just the type.
Q: … oh

In the end the line had 1706 characters.

In order to make my life easier, I tried to use rustc’s error messages as much as possible to avoid any mistakes during deriving the concrete type. I simply set the type of inner to u32 and looked at the expected type after compiling the program which yielded something like this:

TakeWhile<Chain<Chain<Chain<vec::IntoIter<u32>, FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<u32>, [closure@src/lib.rs:71:27: 71:43]>>, Map<FlatMap<Map<FlatMap<result::IntoIter<RedirectIterator<Redirect<u32>>>, Option<Redirect<u32>>, [closure@src/lib.rs:77:27: 77:43]>, [closure@src/lib.rs:79:22: 79:70 block_device:_, block_size:_]>, Box<Iterator<Item=(u32, Arg1, u32)>>, fn((Redirect<u32>, Arg1, u32)) -> Box<Iterator<Item=(u32, Arg1, u32)>> {flatten_tree::<u32>}>, [closure@src/lib.rs:81:22: 81:45]>>, Map<FlatMap<FlatMap<Map<FlatMap<result::IntoIter<RedirectIterator<Redirect<Redirect<u32>>>>, Option<Redirect<Redirect<u32>>>,[closure@src/lib.rs:85:27: 85:43]>, [closure@src/lib.rs:86:22: 86:70 block_device:_, block_size:_]>, Box<Iterator<Item=(Redirect<u32>, Arg1, u32)>>, fn((Redirect<Redirect<u32>>, Arg1, u32)) -> Box<Iterator<Item=(Redirect<u32>, Arg1, u32)>> {flatten_tree::<Redirect<u32>>}>, Box<Iterator<Item=(u32, Arg1, u32)>>, fn((Redirect<u32>, Arg1, u32)) -> Box<Iterator<Item=(u32, Arg1, u32)>> {flatten_tree::<u32>}>, [closure@src/lib.rs:89:22: 89:45]>>, [closure@src/lib.rs:94:25: 94:36]>

Well, that’s certainly unpleasant. The first thing that jumped into my eye is that we can not use closures because they do not have a concrete type which can be referenced. Although we could box them we don’t want to do that either. Alternatively we can pass fn()s instead of closures which then have a concrete type. As a side effect of not being able to use closures, we can not rely on the data passed by each closure. We have to inject/pipe data through all the iterators so that every fn() replacement has access to its respective parameters. This will blow up the type even more. I stilled relied on Boxs for internal iterators which are used for flattening the structure. They will be removed afterwards. Now we have something like this as inner type:

TakeWhile<Chain<Chain<Chain<vec::IntoIter<u32>, Map<FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>, fn((u32, Arg1, u32)) -> u32>>, Map<FlatMap<FlatMap<result::IntoIter<RedirectIterator<Redirect<u32>>>, Option<(Redirect<u32>, Arg1, u32)>, fn(RedirectIterator<Redirect<u32>>) -> Option<(Redirect<u32>, Arg1, u32)>>, Box<Iterator<Item=(u32, Arg1, u32)>>, fn((Redirect<u32>, Arg1, u32)) -> Box<Iterator<Item=(u32, Arg1, u32)>>>, fn((u32, Arg1, u32)) -> u32>>, Map<FlatMap<FlatMap<FlatMap<result::IntoIter<RedirectIterator<Redirect<Redirect<u32>>>>, Option<(Redirect<Redirect<u32>>, Arg1, u32)>, fn(RedirectIterator<Redirect<Redirect<u32>>>) -> Option<(Redirect<Redirect<u32>>, Arg1, u32)>>, Box<Iterator<Item=(Redirect<u32>, Arg1, u32)>>, fn((Redirect<Redirect<u32>>, Arg1, u32)) -> Box<Iterator<Item=(Redirect<u32>, Arg1, u32)>>>, Box<Iterator<Item=(u32, Arg1, u32)>>, fn((Redirect<u32>, Arg1, u32)) -> Box<Iterator<Item=(u32, Arg1, u32)>>>, fn((u32, Arg1, u32)) -> u32>>, fn(&u32) -> bool>

Having arrived at this type, rustc still complains:

error: expected fn pointer, found fn item

Honestly I didn’t know what that meant. Therefore I searched online and found this issue #20178 from 2014. In a nutshell, I think the concrete type of a fn() does not automatically coerce to a compatible fn() type therefore we have to explicitly cast the type to the target type. Which means, for every iterator method like flat_map() or map() we have to cast the type like so:

let new_iter = iter.map(helper_function as fn(u32) -> u32);

After removing the Box<Iterator<_>>s from the various helper fn()s we arrive at the following type which finally compiles:

struct Foo {
    inner: TakeWhile<Chain<Chain<Chain<vec::IntoIter<u32>, Map<FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>, fn((u32, Arg1, u32)) -> u32>>, Map<FlatMap<FlatMap<result::IntoIter<RedirectIterator<Redirect<u32>>>, Option<(Redirect<u32>, Arg1, u32)>, fn(RedirectIterator<Redirect<u32>>) -> Option<(Redirect<u32>, Arg1, u32)>>, FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>,fn((Redirect<u32>, Arg1, u32)) -> FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>>, fn((u32, Arg1, u32)) -> u32>>, Map<FlatMap<FlatMap<FlatMap<result::IntoIter<RedirectIterator<Redirect<Redirect<u32>>>>, Option<(Redirect<Redirect<u32>>, Arg1, u32)>, fn(RedirectIterator<Redirect<Redirect<u32>>>) -> Option<(Redirect<Redirect<u32>>, Arg1, u32)>>, FlatMap<result::IntoIter<RedirectIterator<Redirect<u32>>>, Option<(Redirect<u32>, Arg1, u32)>, fn(RedirectIterator<Redirect<u32>>) -> Option<(Redirect<u32>, Arg1, u32)>>,fn((Redirect<Redirect<u32>>, Arg1, u32)) -> FlatMap<result::IntoIter<RedirectIterator<Redirect<u32>>>, Option<(Redirect<u32>, Arg1, u32)>, fn(RedirectIterator<Redirect<u32>>) -> Option<(Redirect<u32>, Arg1, u32)>>>, FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>, fn((Redirect<u32>, Arg1, u32)) -> FlatMap<result::IntoIter<RedirectIterator<u32>>, Option<(u32, Arg1, u32)>, fn(RedirectIterator<u32>) -> Option<(u32, Arg1, u32)>>>, fn((u32, Arg1, u32)) -> u32>>, fn(&u32) -> bool>
}

Quite the type! And it’s only 1200 bytes big /s. Not the most lightweight type but at least we can store the iterator on the stack if we do not have access to the heap. Although we do fairly simple data transformations, storing the transformer for future uses is a real pain. Although we have a fairly good type inferrer in Rust, we can not use it for inferring the type of a struct’s field. In my opinion this is a big short coming when developing in Rust. Additionally, I can expect that some newcomers who just want to play around for the first time and don’t know about trait objects yet, will be quite surprised/deterred when doing simple data transformations, wanting to store the transformer intuitively on the stack and being offered with one of the above error messages. Therefore I would like to propose the possibility to use type inference for private field types based on destructive assignments within the enclosing module. I want to restrict the use of type inference to private fields due to the following reasons:

  • Keep the whole concept as simple as possible.
  • Nobody should expose a type like inner on a public interface no matter what.
  • Avoid unwanted breaking changes on a public interface during code refactoring.

I could imagine the following syntaxes, though I’m fine with anything as long as the type inferrer is involved:

struct Foo {
    pub id: u32,
    inner: _,
}
struct Foo {
    pub id: u32,
    inner,
}
struct Foo {
    pub id: u32,
    #[infer]
    inner,
}

Additionally I would like to propose type inference for return types of private functions. This comes hand in hand with private fields although we could emulate the behavior by returning a wrapped type.

I could imagine the following syntaxes:

fn foo(id: u32) -> _;

#[infer(Type)] 
fn foo(id: u32) -> Type;

In case of conflicting destructive assignments for a private field, the compiler should simply list all possible candidates until there is no conflict anymore. Of course, this should only work for types which could be stored in a struct beforehand. Storing a Trait or &Trait etc. should still yield an error. Furthermore I’m not quite sure if this could help with the unknown closure types for the various iterator methods. Also one needs additional syntax for referring to a inferred types if you want to reuse a type as an input parameter for another private function. I would like to have a syntax similar to associated types as in Iterator::Item such that Self::inner is possible.

In conclusion, we can actually store pretty complicated iterator structures on the stack but doing so takes considerable effort from the developer which should not be the case. I proposed a new feature using Rust’s type inferrer in order to make a developer’s life easier. I most certainly did not look at all edge cases or what could happen if you do this or that. I simply want to know whether this concept would work in Rust.

Thanks for staying till the end.


The mind agrees.