The gen auto-trait problem
— 2025-01-13

  1. leaking auto-traits from gen bodies
  2. preventing the leakage
  3. what about other effects and auto-traits?
  4. conclusion

One of the open questions surrounding the unstable gen {} feature is whether it should return Iterator or IntoIterator. People have had a feeling there might be good reasons for it to return IntoIterator, but couldn't necessarily articulate why, which is why it was included in the "unresolved questions" section of the gen blocks RFC.

Because I'd like to see gen {} blocks stabilize sooner, I figured it would be worth spending some time looking into this question and seeing whether there are any reasons to pick one over the other. And I have found what I believe to be a fairly annoying issue with gen returning Iterator that I've started calling the gen auto-trait problem. In this post I'll walk through what this problem is, as well as how gen returning IntoIterator would prevent it. So without further ado, let's dive in!

Leaking auto-traits from gen bodies

The issue I've found has to do with auto-trait impls on reified gen {} instances. Take the thread::spawn API: it takes a closure which returns a type T. If you haven't seen its definition before, here it is:

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,

This signature states that the closure F and return type T must be Send. But what it doesn't state is that all local values created inside the closure F must be Send too, a requirement that is common in the async world. Now let's create an iterator using a gen {} block that we try and send across threads. We can create an iterator that yields a single u32:

let iter = gen {
    yield 12u32;
};

Now if we try and pass this to thread::spawn, there are no problems. Our iterator implements Send and things will work as expected:

// ✅ Ok
thread::spawn(move || {
    for num in iter {
        println!("{num}");
    }
}).join().unwrap();

Now to show you the problem, let's try this again but this time with a gen {} block that internally creates a std::rc::Rc. This type is !Send, which means that any type that holds it will also be !Send. That means that iter in our example here is now no longer thread-safe:

let iter = gen { // `-> impl Iterator + !Send`
    let rc = Rc::new(...);
    yield 12u32;
    rc.do_something();
};

That means if we try and send it across threads we'll get a compiler error:

// ❌ cannot send a `!Send` type across threads
thread::spawn(move || {   // ← `iter` is moved
    for num in iter {     // ← `iter` is used
        println!("{num}");
    }
}).join().unwrap();

And that right there is the problem. Even though iterators are lazy and the Rc in our gen {} block isn't actually constructed until iteration begins on the new thread, the generator block needs to reserve space for it internally and so it inherits the !Send restriction. This leads to incompatibilities where locals defined entirely within gen {} itself end up affecting the public-facing API. This ends up being very subtle and tricky to debug, unless you're already familiar with the generator desugaring.
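To make the "reserve space" point concrete, here is a rough hand-written sketch of the kind of state machine a gen {} block like the one above might lower to. The names SingleYield and collect_locally are made up for illustration, and the real compiler output will look different; the point is only that the Rc lives in a variant of the iterator's own type, so the whole type is !Send even before iteration starts.

```rust
use std::rc::Rc;

// Hypothetical stand-in for the state machine behind the `gen {}` block.
// This is an illustration, not the actual desugaring.
enum SingleYield {
    // Nothing has run yet; no `Rc` exists.
    Start,
    // Suspended at the `yield`; `rc` is live across the suspension point,
    // so the type has to store it. This field makes the whole enum `!Send`.
    Suspended { rc: Rc<u32> },
    // The body has finished.
    Done,
}

impl Iterator for SingleYield {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        match std::mem::replace(self, SingleYield::Done) {
            SingleYield::Start => {
                let rc = Rc::new(12u32);
                let value = *rc;
                *self = SingleYield::Suspended { rc };
                Some(value)
            }
            SingleYield::Suspended { rc } => {
                // Stands in for `rc.do_something()` after the `yield`.
                drop(rc);
                None
            }
            SingleYield::Done => None,
        }
    }
}

// Iterating on the current thread is fine.
fn collect_locally() -> Vec<u32> {
    let iter = SingleYield::Start;
    // Moving `iter` into a `thread::spawn` closure here would be rejected,
    // because `SingleYield` contains an `Rc` and is therefore `!Send`.
    iter.collect()
}
```

Note that a value in the Start state holds no Rc at all, yet it still can't cross threads: auto-traits are computed from the type, not from the current state.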

Preventing the leakage

Solving the gen auto-trait problem is not that hard. What we want is for the !Send fields in the generator to not show up in the generated type until we are ready to start iterating over it. That sounds a little scary, but in practice all we have to do is have gen {} return an impl IntoIterator rather than an impl Iterator. The actual Iterator will still be !Send, but the IntoIterator type that produces it will be Send:

let iter = gen { // `-> impl IntoIterator + Send`
    let rc = Rc::new(...);
    yield 12u32;
    rc.do_something();
};

Since our value iter implements Send we can now happily pass it across thread boundaries. And our code continues to operate as expected, as for..in operates on IntoIterator, which is implemented for every Iterator:

// ✅ Ok
thread::spawn(move || {   // ← `iter` is moved
    for num in iter {     // ← `iter` is used
        println!("{num}");
    }
}).join().unwrap();
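The same shape can be approximated on stable Rust today, which may help build intuition. The sketch below assumes a made-up wrapper LazyGen that stores only a closure and builds the real iterator inside into_iter; WithRc and run_on_other_thread are likewise hypothetical names. It is not how gen {} would necessarily be implemented, just an illustration of why the IntoIterator value can be Send while the Iterator it produces is not.

```rust
use std::rc::Rc;
use std::thread;

// Hypothetical wrapper: holds only the recipe for building the iterator.
struct LazyGen<F>(F);

impl<F, I> IntoIterator for LazyGen<F>
where
    F: FnOnce() -> I,
    I: Iterator,
{
    type Item = I::Item;
    type IntoIter = I;

    // The `!Send` iterator only comes into existence here.
    fn into_iter(self) -> I {
        (self.0)()
    }
}

// A `!Send` iterator: it owns an `Rc`.
struct WithRc {
    rc: Rc<u32>,
    done: bool,
}

impl Iterator for WithRc {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.done {
            return None;
        }
        self.done = true;
        Some(*self.rc)
    }
}

fn run_on_other_thread() -> Vec<u32> {
    // The closure captures nothing, so `LazyGen<_>` is `Send`.
    let iter = LazyGen(|| WithRc { rc: Rc::new(12), done: false });

    thread::spawn(move || {
        let mut out = Vec::new();
        // `for..in` calls `into_iter` on the new thread, so the `Rc`
        // is created and dropped there and never crosses threads.
        for num in iter {
            out.push(num);
        }
        out
    })
    .join()
    .unwrap()
}
```

The Rc never leaves the spawned thread, which is exactly the property the closure-based thread::spawn signature already relies on for ordinary locals.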

If you think about it, from a theoretical perspective this makes a lot of sense too. We can think of for..in as our way of handling the iteration effect, which expects an IntoIterator. gen {} is the dual of that, used to create new instances of the iteration effect. It's not at all strange for it to return the same trait that for..in expects. With the added bonus that it doesn't leak auto-traits from impl bodies to the type's signature.

What about other effects and auto-traits?

This issue primarily affects effects which make use of the generator transform. That is: iteration and async. In theory I believe that, yes, async {} should probably return IntoFuture rather than Future. In WG Async we regularly see issues that have to do with auto-traits leaking from inside function bodies out into the type's signatures. If async (2019) had returned IntoFuture (2022) rather than Future (2019), that certainly seems like it could have helped. Though there would be a higher barrier to making that change today, now that things are already stable.
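As a rough illustration of what that could look like for async, here is a stable-Rust sketch built on the existing IntoFuture trait. LazyFuture, NoopWake, assert_send, and poll_once_lazy are all hypothetical names introduced for this example, and the hand-rolled polling is only there to keep the snippet free of an executor dependency. The wrapper is Send even though the future it eventually produces holds an Rc across an .await.

```rust
use std::future::{Future, IntoFuture};
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical wrapper: stores a closure that builds the future on demand.
struct LazyFuture<F>(F);

impl<F, Fut> IntoFuture for LazyFuture<F>
where
    F: FnOnce() -> Fut,
    Fut: Future,
{
    type Output = Fut::Output;
    type IntoFuture = Fut;

    // The `!Send` future only comes into existence here.
    fn into_future(self) -> Fut {
        (self.0)()
    }
}

// Minimal waker that does nothing, used to poll without an executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Compile-time check: only accepts references to `Send` values.
fn assert_send<T: Send>(_: &T) {}

fn poll_once_lazy() -> Poll<u32> {
    let lazy = LazyFuture(|| async {
        let rc = Rc::new(12u32);
        // `rc` is held across this `.await`, so the future is `!Send`.
        std::future::ready(()).await;
        *rc
    });

    // The wrapper itself captures nothing and is `Send`.
    assert_send(&lazy);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(lazy.into_future());
    fut.as_mut().poll(&mut cx)
}
```

Since .await already accepts any IntoFuture, a value like lazy can be awaited directly in async code without calling into_future by hand.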

On the trait side this doesn't just affect Send either: it applies to all auto-traits, present and future. Though Send is by far the most common trait people will experience issues with today, to a lesser extent this also already applies to Sync. And in the future possibly also auto-traits like Freeze, Leak, and Move. Though this shouldn't be the main motivator, preventing potential future issues is not worthless either.

Conclusion

Admittedly the worst part of gen {} returning IntoIterator is that the name of the trait kind of sucks. IntoIterator sounds auxiliary to Iterator, and so it feels a little wrong to make it the thing we return from gen {}. But on top of that: that's a lot of characters to write.

I wonder what would have happened if we'd taken an approach more similar to Swift. Swift has Sequence which is like Rust's IntoIterator and IteratorProtocol which is like Rust's Iterator. The primary interface people are expected to use is short and memorable. While the secondary interface isn't meant to be directly used, and so it has a much longer and less memorable name. As we're increasingly thinking of IntoIterator as the primary interface for async iteration, maybe we'll want to revisit the trait naming scheme in the future.

In conclusion: it seems like a good idea to prevent auto-traits from leaking from gen bodies when reified. This is the kind of issue we should prevent if we can, as it can be both difficult to diagnose and annoying to work around. Having auto-trait leakage be a non-issue for generator blocks seems worthwhile.