A survey of every iterator variant
— 2025-02-12

  1. introduction
  2. base iterator
  3. bounded iterator
  4. fused iterator
  5. thread-safe iterator
  6. dyn-compatible iterator
  7. double-ended iterator
  8. seeking iterator
  9. compile-time iterator
  10. lending iterator
  11. iterator with a return value
  12. iterator with a next argument
  13. short-circuiting iterator
  14. address-sensitive iterator
  15. iterator guaranteeing destruct
  16. async iterator
  17. concurrent iterator
  18. conclusion

Introduction

Oh yeah, you like iterators? Name all of them. —

I'm increasingly of the belief that before we can meaningfully discuss the solution space, we have to first make an effort to build consensus on the problem space. It's hard to plan for a journey together if we don't know what the trip will be like along the way. Or worse: if we don't agree in advance where we want the journey to take us.

In Rust the Iterator trait is the single-most complex trait in the stdlib. It provides 76 methods, and by my estimate (I stopped counting at 120) has around 150 trait implementations in the stdlib alone. It also features a broad range of extension traits like FusedIterator and ExactSizeIterator that provide additional capabilities. And it is itself a trait-based expression of one of Rust's core control-flow effects; meaning it is at the heart of a lot of interesting questions about how we combine such effects.

Despite all that we know the Iterator trait falls short today and we would like to extend it to do more. Async iteration is one popular capability people are missing as a built-in. Another is the ability to author address-sensitive (self-referential) iterators. Less in the zeitgeist but equally important are iterators that can lend items with a lifetime, conditionally terminate early, and take external arguments on each call to next.

I'm writing this post to enumerate all of the iterator variants in use in Rust today. So that we can scope the problem space to account for all of them. I'm also doing this because I've started hearing talk about potentially (soft-)deprecating Iterator. I believe we only get one shot at running such a deprecation 1, and if we are going to do one we need to make sure it's a solution that will plausibly get us through all of our known limitations. Not just one or two.

So without further ado: let's enumerate every variation of the existing Iterator trait we know we want to write, and discuss how those variants interact with each other to create interesting new kinds of iterators.

1

I'm basing this off of my experience being a part of the Node.js Streams WG through the mid-2010s. By my count Node.js now features five distinct generations of stream abstractions. Unsurprisingly it's not particularly pleasant to use, and ideally something we should avoid replicating in Rust. Though I'm not opposed to running deprecations of core traits in Rust either, I believe that in order to do that we want to be approaching 100% certainty that it'll be the last deprecation we do. If we're going to embark on a decade plus of migration issues, we have to make absolutely sure it's worth it.

Base Iterator

Iterator is a trait representing the stateful component of iteration; IntoIterator represents the capability for a type to be iterated over. Here is the Iterator trait as found in the stdlib today. The Iterator and IntoIterator traits are closely linked, so throughout this post we will show both to paint a more complete picture.

Throughout this post we will be covering variations on Iterator which provide new capabilities. This trait is represents the absence of those capabilities: it is blocking, cannot be evaluated during compilation, is strictly sequential, and so on. Here is what the foundation of the core::iter submodule looks like today:

/// An iterable type.
pub trait IntoIterator {
    /// The type of elements yielded from the iterator.
    type Item;
    /// Which kind of iterator are we turning this into?
    type IntoIter: Iterator<Item = Self::Item>;
    /// Returns an iterator over the elements in this type.
    fn into_iter(self) -> Self::IntoIter;
}

/// A type that yields values, one at the time.
pub trait Iterator {
    /// The type of elements yielded from the iterator.
    type Item;
    /// Advances the iterator and yields the next value.
    fn next(&mut self) -> Option<Self::Item>;
    /// Returns the bounds on the remaining length of the iterator.
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

While not strictly necessary: the associated IntoIterator::Item type exists for convenience. That way people using the trait can directly specify the Item using impl IntoIterator<Item = Foo>, which is a lot more convenient than I: <IntoIterator::IntoIterator as Iterator>::Item, etc.

Bounded Iterator

The base Iterator trait represents a potentially infinite (unbounded) sequence of items. It has a size_hint method that returns how many items the iterator still expects to yield. That value is however not guaranteed to be correct, and is intended to be used for optimizations only. From the docs:

Implementation notes It is not enforced that an iterator implementation yields the declared number of elements. A buggy iterator may yield less than the lower bound or more than the upper bound of elements.

size_hint() is primarily intended to be used for optimizations such as reserving space for the elements of the iterator, but must not be trusted to e.g., omit bounds checks in unsafe code. An incorrect implementation of size_hint() should not lead to memory safety violations.

The Rust stdlib provides two Iterator subtraits that allow it to guarantee it is bounded: ExactSizeIterator (stable) and TrustedLen (unstable). Both traits take : Iterator as its super-trait bound and on its surface seem nearly identical. But there is one key difference: TrustedLen is unsafe to implement allowing it to be used to guarantee safety invariants.

/// An iterator that knows its exact length.
pub trait ExactSizeIterator: Iterator {
    /// Returns the exact remaining length of the iterator.
    fn len(&self) -> usize { .. }
    /// Returns `true` if the iterator is empty.
    fn is_empty(&self) -> bool { .. }
}

/// An iterator that reports an accurate length using `size_hint`.
#[unstable(feature = "trusted_len")]
pub unsafe trait TrustedLen: Iterator {}

ExactSizeIterator has the same memory-safety guarantees as Iterator::size_hint meaning: you can't rely on it to be correct. That means that if you're say, collecting items from an iterator into a vec, you can't omit bounds checks and use ExactSizeIterator::len as the input to Vec::set_len. However if TrustedLen is implemented bounds checks can be omitted, since the value returned by size_hint is now a safety invariant of the iterator.

Fused Iterator

When working with an iterator, we typically iterate over items until the iterator yields None at which point we treat the iterator as "done". However the documentation for Iterator::next includes the following:

Returns None when iteration is finished. Individual iterator implementations may choose to resume iteration, and so calling next() again may or may not eventually start returning Some(Item) again at some point.

It's rare to work with iterators which yield None and then resume again afterwards, but the iterator trait explicitly allows it. Just like it allows for an iterator to panic if next is called again after None has been yielded once. Luckily the FusedIterator subtrait exists, which guarantees that once None has been yielded once from an iterator, all future calls to next will continue to yield None.

/// An iterator that always continues to yield `None` when exhausted.
pub trait FusedIterator: Iterator {}

Most iterators in the stdlib implement FusedIterator. For iterators which aren't fused, it's possible to call the Iterator::fuse combinator. The stdlib docs recommend never taking a FusedIterator bound, instead favoring Iterator in bounds and calling fuse to guarantee the fused behavior. For iterators that already implement FusedIterator this is considered a no-op.

The FusedIterator design works because Option::None is idempotent: there is never a scenario where we can't create a new instance of None. Contrast this with enums such as Poll that lack a "done" state - and you see ecosystem traits such as FusedFuture attempting to add this lack of expressivity back through other means. The need for an idempotent "done" state will be important to keep in mind as we explore other iterator variants throughout this post.

Thread-Safe Iterator

Where bounded and fused iterators can be obtained by refining the Iterator trait using subtraits, thread-safe iterators are obtained by composing the Send and Sync auto-traits with the Iterator trait. That means there is no need for dedicated SendIterator or SyncIterator traits. A "thread-safe iterator" becomes a composition of Iterator and Send / Sync:

struct Cat;

// Manual `Iterator` impl
impl Iterator for Cat { .. }

// These impls are implied by `Send`
// and `Sync` being auto-traits
unsafe impl Send for Cat {}
unsafe impl Sync for Cat {}

And when taking impls in bounds, we can again express our intent by composing traits in bounds. I've made the argument before that taking Iterator directly in bounds is rarely what people actually want, so the bound would end up looking like this:

fn thread_safe_sink(iter: impl IntoIterator + Send) { .. }

If we also want the individual items yielded by the iterator to be marked thread-safe, we have to add additional bounds:

fn thread_safe_sink<I>(iter: I)
where
    I: IntoIterator<Item: Send> + Send,
{ .. }

While there is no lack of expressivity here, at times bounds like these can get rather verbose. That should not be taken as indictment of the system itself, but rather as a challenge to improve the ergonomics for the most common cases.

Dyn-Compatible Iterator

Dyn-compatibility is another axis that traits are divided on. Unlike e.g. thread-safety, dyn-compatibility is an inherent part of the trait and is governed by Sized bounds. Luckily both the Iterator and IntoIterator traits are inherently dyn-compatible. That means they can be used to create trait objects using the dyn keyword:

struct Cat;
impl Iterator for Cat { .. }

let cat = Cat {};
let dyn_cat: &dyn Iterator = &cat; // ok

There are some iterator combinators such as count that take an additional Self: Sized bound 2. But because trait objects are themselves sized, it all mostly works as expected:

2

Admittedly my knowledge of dyn is spotty at best. Out of everything in this post, the section on dyn was the bit I had the least intuition about.

let mut cat = Cat {};
let dyn_cat: &mut dyn Iterator = &mut cat;
assert_eq!(dyn_cat.count(), 1); // ok

Double-Ended Iterator

Often times iterators over collections hold all data in memory, and can be traversed in either direction. For this purpose Rust provides the DoubleEndedIterator trait. Where Iterator builds on the next method, DoubleEndedIterator builds on a next_back method. This allows items to be taken from both the logical start and end of the iterator. And once both cursors meet, iteration is considered over.

/// An iterator able to yield elements from both ends.
pub trait DoubleEndedIterator: Iterator {
    /// Removes and  an element from the end of the iterator.
    fn next_back(&mut self) -> Option<Self::Item>;
}

While you'd expect this trait to be implemented for e.g. VecDeque, it's interesting to note that it's also implemented for Vec, String, and other collections that only grow in one direction. Also unlike some of the other iterator refinements we've seen, DoubleEndedIterator has a required method that is used as the basis for several new methods such as rfold (reverse fold) and rfind (reverse find).

Seeking Iterator

Both the Iterator and Readtraits in Rust provide abstractions for streaming iteration. The main difference is that Iterator works by returning arbitrary owned types when calling next, where Read is limited to reading bytes into buffers. But both abstractions keep a cursor that keeps track of which data has been processed already, and which data still needs processing.

But not all streams are created equally. When we read data from a regular file 3 on disk we can be sure that, as long as no writes occurred, we can read the same file again and get the same output. That same guarantee is however not true for sockets, where once we've read data from it we can't read that same data again. In Rust this distinction is surfaced via the Seek trait, which gives control over the Read cursor in types that support it.

3

No, files in /proc are not "regular files". Yes, I know sockets are technically file descriptors.

In Rust the Iterator trait provides no mechanism to control the underlying cursor, despite its similarities to Read. A language that does provide an abstraction for this is C++ in the form of random_access_iterator. This is a C++ concept (trait-alike) that further refines bidirectional_iterator. My C++ knowledge is limited, so I'd rather quote the docs directly than attempt to paraphrase:

[...] random_access_iterator refines bidirectional_iterator by adding support for constant time advancement with the +=+-=, and - operators, constant time computation of distance with -, and array notation with subscripting [].

Being able to directly control the cursor in Iterator implementations could prove useful when working with in-memory collection types like Vec, as well as when working with remote objects such as paginated API endpoints. The obvious starting point for such a trait would be to mirror the existing io::Seek trait and adapt it to be a subtrait of Iterator:

/// Enumeration of possible methods to seek within an iterator.
pub enum SeekFrom {
    /// Sets the offset to the specified index.
    Start(usize),
    /// Sets the offset to the size of this object plus the specified index.
    End(isize),
    /// Sets the offset to the current position plus the specified index.
    Current(isize),
}

/// An iterator with a cursor which can be moved.
pub trait SeekingIterator: Iterator {
    /// Seek to an offset in an iterator.
    fn seek(&mut self, pos: SeekFrom) -> Result<usize>;
}

Compile-Time Iterator

In Rust we can use const {} blocks to execute code during compilation. Only const fn functions can be called from const {} blocks. const fn free functions and methods are stable, but const fn trait methods are not. This means that traits like Iterator are not yet callable from const {} blocks, and so neither are for..in expressions.

We know we want to support iteration in const {} blocks, but we don't yet know how we want to spell both the trait declaration and trait bounds. The most interesting open question here is how we will end up passing the Destruct trait around, which is necessary to enable types to be droppable in const contexts. This leads to additional questions around whether const trait bounds should imply const Destruct. And whether the const annotation should be part of the trait declaration, the individual methods, or perhaps both.

This post is not the right venue to discuss all tradeoffs. But to give a sense of what a compile-time-compatible Iterator trait might look like: here is a variant where both the trait and individual methods are annotated with const:

pub const trait IntoIterator {                        // ← `const`
    type Item;
    type IntoIter: const Iterator<Item = Self::Item>; // ← `const`
    const fn into_iter(self) -> Self::IntoIter;       // ← `const`
}

pub const trait Iterator {                                     // ← `const`
    type Item;
    const fn next(&mut self) -> Option<Self::Item>;            // ← `const`
    const fn size_hint(&self) -> (usize, Option<usize>) { .. } // ← `const`
}

Lending Iterator

While there are plenty of iterators in the stdlib today that yield references to items, the values that are being referenced are never owned by the iterator itself. In order to write iterators which owns the items it yields and yields them by-reference, the associated item type in iterator needs a lifetime:

pub trait IntoIterator {
    type Item<'a>                                               // ← Lifetime
    where
        Self: 'a;                                               // ← Bound
    type IntoIter: for<'a> Iterator<Item<'a> = Self::Item<'a>>; // ← Lifetime
    fn into_iter(self) -> Self::IntoIter;
}

pub trait Iterator {
    type Item<'a>                                 // ← Lifetime
    where
        Self: 'a;                                 // ← Bound
    fn next(&mut self) -> Option<Self::Item<'_>>; // ← Lifetime
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

There has been talk about adding a 'self lifetime to the language as a shorthand for where Self:'a. But even with that addition this set of signatures is not for the faint of heart. The reason for that is that it makes use of GATs, which are both useful and powerful — but are for all intents and purposes an expert-level type-system feature, and can be a little tricky to reason about.

Lending iteration is also going to be important when we add dedicated syntax to construct iterators using the yield keyword. The following example shows a gen {} block which creates a string and then yields a reference to it. This perfectly4 matches onto the Iterator trait we’ve defined:

4

Remember that strings are heap-allocated so we don’t need to account for address-sensitivity. While the compiler can't yet perform this desugaring, there is nothing in the language prohibiting it.

let iter = gen {
    let name = String::from("Chashu");
    yield &name; // ← Borrows a local value stored on the heap.
};

Iterator with a Return Value

The current Iterator trait has an associated type Item, which maps to the yield keyword in Rust. But it has no associated type that maps to the return keyword. A way to think about iterators today is that their return type is hard-coded to unit. If we want to enable generator functions and blocks to be written which can not just yield, but also return, we'll need some way to express that.

let counting_iter = gen {
    let mut total = 0;
    for item in iter {
        total += 1;
        yield item;      // ← Yields one type.
    }
    total                // ← Returns another type.
};

The obvious way to write a trait for "an iterator which can return" is to give Iterator an additional associated item Output which maps to the logical return value. In order to be able to express fuse semantics, the function next needs to be able to return three different states:

  1. Yielding the associated type Item
  2. Returning the associated type Output
  3. The iterator has been exhausted (done)

One way to do this would be for next to return Option<ControlFlow>, where Some(Continue) maps to yield, Some(Break) maps to return, and None maps to done. Without a final "done" state, calling next again on an iterator after it's finished yielding values would likely always have to panic. This is what most futures do in async Rust, and is going to be a problem if we ever want to guarantee panic-freedom.

pub trait IntoIterator {
    type Output;                                            // ← Output
    type Item;
    type IntoIter:
        Iterator<Output = Self::Output, Item = Self::Item>; // ← Output
    fn into_iter(self) -> Self::IntoIter;
}

pub trait Iterator {
    type Output;                                           // ← Output
    type Item;
    fn next(&mut self)
        -> Option<ControlFlow<Self::Output, Self::Item>>;  // ← Output
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

The way this would work on the caller side and in combinators is quite interesting. Starting with the provided for_each method: it will want to return Iterator::Output rather than () once the iterator has completed. Crucially the closure F provided to for_each only operates on Self::Output, and has no knowledge of Self::Output. Because if the closure did have direct knowledge of Output, it could short-circuit by returning Output sooner than expected which is a different kind of iterator than having a return value.

fn for_each<F>(self, f: F) -> Self::Output // ← Output
where
    Self: Sized,
    F: FnMut(Self::Item),
{ .. }

If we were to transpose "iterator with return value" to for..in things get even more interesting still. In Rust loop expressions can themselves evaluate to non-unit types by calling break with some value. for..in expressions cannot do this in the general case yet, except for the handling of errors using ?. But it's not hard to see how this could be made to work, conceptually this is equivalent to calling Iterator::try_for_each and returning Some(Break(value)):

let ty: u32 = for item in iter {  // ← Evaluates to a `u32`
    break 12u32                   // ← Calls `break` from the loop body
};

Assuming we have an Iterator with its own return value, that would mean for..in expressions would be able to evaluate to non-unit return types without calling break from within the loop's body:

let ty: u32 = for item in iter {  // ← Evaluates to a `u32`
    dbg!(item);                   // ← Doesn't `break` from the loop body
};

This of course leads to questions about how to combine an "iterator with a return value" and "using break from inside a for..in expression". I'll leave that as an exercise to the reader on how to spell out (I'm certain it can be done, I just think it's a fun one). Generalizing all modes of early returns from for..in expressions to invocations of for_each combinators is an interesting challenge that we'll cover in more detail later on when we discuss short-circuiting (fallible) iterators.

Iterator with a Next Argument

In generator blocks the yield keyword can be used to repeatedly yield values from an iterator. But what if the caller doesn't just want to obtain values, but also pass new values back into the iterator? That would require for yield to be able evaluate to a non-unit type. Iterators that have this functionality are often referred to as "coroutines", and they are being particularly useful when implementing I/O-Free Protocols.

/// Some RPC protocol
enum Proto {
    /// Some protocol state
    MsgLen(u32),
}

let rpc_handler = gen {
    let len = message.len();
    let next_state = yield Proto::MsgLen(len); // ← `yield` evaluates to a value
    ..
};

In order to support this, Iterator::next needs to be able to take an additional argument in the form of a new associated type Args. This associated type has the same name as the input arguments to Fn and AsyncFn. If "iterator with a next argument" can be thought of as representing a "coroutine", the Fn family of traits can be thought of as representing a regular "routine" (function).

pub trait IntoIterator {
    type Item;
    type Args;                                          // ← Args
    type IntoIter:
        Iterator<Item = Self::Item, Args = Self::Args>; // ← Args
    fn into_iter(self) -> Self::IntoIter;
}

pub trait Iterator {
    type Item;
    type Args;                                                  // ← Args
    fn next(&mut self, args: Self::Args) -> Option<Self::Item>; // ← Args
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

Here too it's interesting to consider the caller side. Starting with a hypothetical for_each function: it would need to be able to take an initial value passed to next, and then from the closure be able to return additional values passed to subsequent invocations of next. The signature for that would look like this:

fn for_each<F>(self, args: Self::Args, f: F) // ← Args
where
    Self: Sized,
    F: FnMut(Self::Item) -> Self::Args,      // ← Args
{ .. }

To better illustrate how this would work and build up some intuition, consider manual calls to next inside of a loop. We would start by constructing some initial state that is passed to the first invocation of next. This will produce an item that can be used to construct the next argument to next. And so on, until the iterator has no more items left to yield:

let mut iter = some_value.into_iter();
let mut arg = some_initial_value;

// loop as long as there are items to yield
while let Some(item) = iter.next(arg) {
    // use `item` and compute the next `arg`
    arg = process_item(item);
}

If we were to iterate over an iterator with a next function using a for..in expression, the equivalent of returning a value from a closure would be either to continue with a value. This could potentially also be the final expression in the loop body, which you can think of itself being an implied continue today. The only question remaining is how to pass initial values on loop construction, but that mainly seems like an exercise in syntax design:

// Passing a value to the `next` function seems like
// it would logically map to `continue`-expressions.
// (passing initial state to the loop intentionally omitted)
for item in iter {
    continue process_item(item);   // ← `continue` with value
};

// You can think of `for..in` expressions as having
// an implied `continue ()` at the end. Like functions
// have an implied `return ()`. What if that could take
// a value?
// (passing initial state to the loop intentionally omitted)
for item in iter {
    process_item(item)             // ← `continue` with value
};

"iterator with return value" and "iterator with next argument" feel like they map particularly well to break and continue. I think of them as duals, enabling both expressions to carry non-unit types. This feels like it might be an important insight that I haven't seen anyone bring up before.

Being able to pass a value to next is one of the hallmark features of the Coroutine trait in the stdlib. However unlike the trait sketch we provided here, in Coroutine the type of the value passed to next is defined as a generic param on the trait rather than an associated type. Presumably that is so that Coroutine can have multiple implementations on the same type that depend on the input type. I searched whether this was actually being used today, and it doesn't seem to be. Which is why I suspect it is likely fine to use an associated type for the input arguments.

Short-Circuiting Iterator

While it is possible to return Try-types such as Result and Option from an iterator today, it isn't yet possible is to immediately stop execution in the case of an error [^not-panic]. This behavior is typically referred to as "short-circuiting": halting normal operation and triggering an exceptional state (like an electrical breaker would in a building).

5

By that I mean: an error, not a panic.

In unstable Rust we have try {} blocks that rely on the unstable Try trait, but we don't yet have try fn functions. If we want these to function in traits the way we'd want them to, they will have to desugar to impl Try. Rather than speculate about what a potential try fn syntax might look like in the future, we'll be writing our samples using -> impl Try directly. Here is what the Try (and FromResidual) traits look like today:

/// The `?` operator and `try {}` blocks.
pub trait Try: FromResidual {
    /// The type of the value produced by `?`
    /// when _not_ short-circuiting.
    type Output;
    
    /// The type of the value passed to `FromResidual::from_residual`
    /// as part of ? when short-circuiting.
    type Residual;
    
    /// Constructs the type from its `Output` type.
    fn from_output(output: Self::Output) -> Self;

    /// Used in ? to decide whether the operator should produce
    /// a value ( or propagate a value back to the caller.    
    fn branch(self) -> ControlFlow<Self::Residual, Self::Output>;
}

/// Used to specify which residuals can be converted
/// into which `Try` types.
pub trait FromResidual<R = <Self as Try>::Residual> {
    /// Constructs the type from a compatible `Residual` type.
    fn from_residual(residual: R) -> Self;
}

You can think of a short-circuiting iterator as a special case of an "iterator with return value". In its base form, it will only return early in case of an exception while its logical return type remains hard-coded to unit. The return type of fn next should be an impl Try returning an Option, with the value of Residual set to the associated Residual type. This allows all combinators to share the same Residual, enabling types to flow.

pub trait IntoIterator {
    type Item;
    type Residual;
    type IntoIter: Iterator<
        Item = Self::Item,
        Residual = Residual   // ← Residual
    >;
    fn into_iter(self) -> Self::IntoIter;
}

pub trait Iterator {
    type Item;
    type Residual;                   // ← Residual
    fn next(&mut self) -> impl Try<  // ← impl Try
        Output = Option<Self::Item>,
        Residual = Self::Residual,   // ← Residual
    >;
}

If we once again consider the caller, we'll want to provide a way for for..in expressions to short-circuit. What's interesting here is that the base iterator trait already provides a try_for_each, method. The difference between that method and the for_each we're about to see is how the type of Residual is obtained. In try_for_each the value is local to the method, while if the trait itself is "short-circuiting" the type of Residual is determined by the associated Self::Residual type. Or put differently: in a short-circuiting iterator the type we short-circuit with is a property of the trait rather than a property of the method.

fn for_each<F, R>(self, f: F) -> R                  // ← Return type
where
    Self: Sized,
    F: FnMut(Self::Item) -> R,                      // ← Return type
    R: Try<Output = (), Residual = Self::Residual>, // ← `impl Try`
{ .. }

As mentioned earlier on in this post: the interaction between "iterator with a return type" and "short-circuiting iterator" is an interesting one. Returning Option<ControlFlow> from fn next is able to encode three distinct states, but this combination of capabilities requires us to encode four states:

  1. yield a next item
  2. break with a residual
  3. return a final output
  4. iterator done (idempotent)

The reason why we want to be able to encode a signature like this is because when writing gen fn functions it is entirely reasonable to want to have a return type, short-circuit on error using ?, and also yield values. This works like regular functions today, but with the added capability to invoke yield. The naive way of encoding this would be to return an impl Try of Option<ControlFlow<_>> with distinct associated types for Item, Output, and Residual. This does however feel like it is starting to get a little out of hand, though perhaps a first-class try fn notation might provide some relief.

pub trait IntoIterator {
    type Item;
    type Output;                           // ← `Output` 
    type Residual;                         // ← `Residual` 
    type IntoIterator: Iter<
        Item = Self::Item,
        Residual = Self::Residual,         // ← `Residual` 
        Output = Self::Output,             // ← `Output` 
    >;
    fn into_iter(self) -> Self::IntoIter;
}

pub trait Iterator {
    type Item;
    type Output;                     // ← `Output`
    type Residual;                   // ← `Residual`
    fn next(&mut self) -> impl Try<  // ← `impl Try` 
        Output = Option<ControlFlow< // ← `ControlFlow
            Self::Output,            // ← `Output
            Self::Item,
        >>,
        Residual = Self::Residual,   // ← `Residual` 
    >;
}

Address-Sensitive Iterator

Rust's generator transformation may create self-referential types. That is: types which have fields that borrow from other fields on the same type. We call these types "address-sensitive" because once the type has been constructed, its address in memory must remain stable. This comes up when writing gen {} blocks that have stack-allocated locals 6 that are kept live across yield-expressions. What is or isn't a "stack-allocated local" can get a little complicated. But it's important to highlight that for example calling IntoIterator::into_iter on a type and re-yielding all items is something that just works (playground):

6

This affects heap-allocated locals too, but that's not a limitation of the language, only one of the impl.

// This example works today
let iter = gen {
    let cat_iter = cats.into_iter();
    for cat in cat_iter {
        yield cat;
    }
};

And to give a sense of what for example does not work, here is one of the samples Tmandry (T-Lang) collected. This creates an intermediate borrow, which results in the error: "Borrow may still be in use when gen fn body yields" (playground):

gen fn iter_set_rc<T: Clone>(xs: Rc<RefCell<HashSet<T>>>) ... {
    for x in xs.borrow().iter() {
        yield x.clone();
    }
}

In order to enable examples like the last one to work, Rust needs to be able to express some form of "address-sensitive iterator". The obvious starting point would be to mint a new trait PinnedIterator which changes the self-type of next to take a Pin<&mut Self> rather than &mut self:

trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn into_iter(self) -> Self::IntoIter;
}

trait Iterator {
    type Item;
    fn next(self: Pin<&mut Self>) -> Option<Self::Item>; // ← `Pin<&mut Self>`
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

Enumerating all the problems of Pin is worth its own blog post. But still, it seems important enough to point out that this definition has what Rust for Linux calls The Safe Pinned Initialization Problem. IntoIterator::into_iter cannot return a type that is address-sensitive at time of construction, instead address-sensitivity is something that can only be guaranteed at a later point once the type is pin!ned in place.

At the start of this post I used the phrase: "(soft-)deprecation of the Iterator trait". With that I was referring to one proposed idea to enable gen {} to return a new trait Generator with the same signature as our example. As well as some bridging impls for the purposes of compatibility. The core of the compat system would be as follows:

/// All `Iterator`s are `Generator`s
impl<I: IntoIterator> IntoGenerator for I {
    type Item = I::Item;
    type IntoGen = IteratorGenerator<I::IntoIter>;
    fn into_gen(self) -> Self::IntoGen {
        IteratorGenerator(self.into_iter())
    }
}

/// Only pinned `Generator`s are `Iterator`s
impl<G> Iterator for Pin<G>
where
    G: DerefMut,
    G::Target: Generator,
{
    type Item = <<G as Deref>::Target as Generator>::Item;
    fn next(&mut self) -> Option<Self::Item> {
        Generator::next(self.as_mut())
    }
}

This creates a situation that I've been describing as "one-and-a-half-way compat", as opposed to the usual two-way-compat. And we need two-way-compat to not be a breaking change. This leads to a situation where changing a bound from taking Iterator to Generator is backwards-compatible. But changing an impl from returning an Iterator to returning a Generator is not backwards-compatible. The obvious solution then would be to migrate the entire ecosystem to take Generator bounds everywhere. Coupled with gen {} only ever returning Generator and not Iterator: that is a deprecation of Iterator in all but name.

At first sight it might seem like we're being forced into deprecating Iterator because of the limitations of Pin. The obvious answer to that would be to solve the issues with Pin by replacing it with something better. But that creates a false dichotomy: there is nothing forcing us to make a decision on this today. As we established at the start of this section: a surprising amount of use cases already work without the need for address-sensitive iterators. And as we've seen throughout this post: address-sensitive iteration is far from the only feature that gen {} blocks will not be able to support on day one.

Iterator Guaranteeing Destruct

The current formulation of thread::scope requires that the thread it’s called on remains blocked until all threads have been joined. This requires stepping into a closure and executing all code within that. Contrast this with something like FutureGroup which logically owns computations and can be freely moved around. The values of the futures resolved within can in turn be yielded out. But unlike thread::scope it can’t guarantee that all computations will complete, and so a parallel version of FutureGroup can't mutably hold onto mutable borrows the way thread::scope can.

// A usage example of `thread::scope`,
// the ability to spawn new threads
// is only available inside the closure.
thread::scope(|s| {
    s.spawn(|| ..);
    s.spawn(|| ..);
                    // ← All threads are joined here.
});

// A usage example of `FutureGroup`,
// no closures required.
let mut group = FutureGroup::new();
group.insert(future::ready(2));
group.insert(future::ready(4));
group.for_each(|_| ()).await;

If we want to write a type similar to FutureGroup with the same guarantees as thread::scope, we'd either need to guarantee that FutureGroup can never be dropped or guarantee that FutureGroup's Drop impl is always run. It turns out that it's rather impractical to have types that can't be dropped in a language where every function may panic. So the only real option here are to have types whose destructors are guaranteed to run.

The most plausible way we know of to do this would be by introducing a new auto trait Leak, disallowing types from being passed to mem::forget, Box::leak, and so on. For more on the design, read Linear Types One-Pager. Because Leak is an auto-trait, we could compose it with the existing Iterator and IntoIterator traits, similar to Send and Move:

fn linear_sink(iter: impl IntoIterator<IntoIter: ?Leak>) { .. }

Async Iterator

In Rust the async keyword can transform imperative function bodies into state machines that can be manually advanced by calling the Future::poll method. Under the hood this is done using what is called a coroutine transform, which is the same transform we use to desugar gen {} blocks with. But that's just the mechanics of it; the async keyword in Rust also introduces two new capabilities: ad-hoc concurrency and ad-hoc cancellation. Together these capabilities can be combined to create new control-flow operations, like Future::race and Future::timeout.

Async Functions in Traits were stabilized one year ago in Rust 1.75, enabling the use of async fn in traits. Making the Iteratortrait work with async is mostly a matter of adding an async prefix to next:

trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn into_iter(self) -> Self::IntoIter;
}

trait Iterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;      // ← async
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

While the next method here would be annotated with the async keyword, the size_hint method should probably not be. The reason for that is that it acts as a simple getter, and it really shouldn't be performing any asynchronous computation. It's also unclear whether into_iter should be an async fn or not. There is probably a pattern to be established here, and it might very well be.

An combination of iterator variants that's been of some interest recently is an address-sensitive async iterator. We could imagine writing an address-sensitive async iterator by making next take self: Pin<&mut Self>:

trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    async fn into_iter(self) -> Self::IntoIter;
}

trait Iterator {
    type Item;
    async fn next(self: Pin<&mut Self>) -> Option<Self::Item>; // ← async + `Pin<&mut Self>`
    fn size_hint(&self) -> (usize, Option<usize>) { .. }
}

This signature is likely to confuse some people. async fn next return an impl Future which itself must be pinned prior to being polled. In this example we are separately requiring that Self is also pinned. That is because "the state of the iterator" and "the state of the future" are not the same state. We intuitively understand this when working with non-async address-sensitive iterators: the locals created within the next are not captured by the enclosing iterator, and are free to be stack-pinned for the duration of the call to next. But when working with an asynchronous address-sensitive iterator, for some reason people seem to assume that all locals defined in fn next now need to be owned by the iterator and not the future.

In the async Rust ecosystem there exists a popular variation of an async iterator trait called Stream. Rather than keeping the state of the iterator (self) and the next function separate, it combines both into a single state. The trait has a single method poll_next which acts as a mix between Future::poll and Iterator::next. With a provided convenience function async fn next that is a thin wrapper around poll_next.

trait IntoStream {
    type Item;
    type IntoStream: Stream<Item = Self::Item>;
    fn into_stream(self) -> Self::IntoStream;
}

pub trait Stream {
    type Item;
    fn poll_next(                            // ← `fn poll_next`
        self: Pin<&mut Self>,                // ← `Pin<&mut Self>`
        cx: &mut Context<'_>,                // ← `task::Context`
    ) -> Poll<Option<Self::Item>>;           // ← `task::Poll` 
    async fn next(&mut self) -> Self::Item   // ← `async`
    where
        Self: Unpin                          // ← `Self: Unpin`
    { .. }
    fn size_hint(&self) -> (usize, Option<usize>) { ... }
}

By combining both states into a single state, this trait violates one of the core tenets of async Rust's design: the ability to uniformly communicate cancellation by dropping futures. Here if the future by fn next is dropped, that is a no-op and cancellation will not occur. This causes compositional async control-flow operators like Future::race to not work despite compiling.

To instead cancel the current call to next you are forced to either drop the entire stream, or use some bespoke method to cancel just the future's state. Cancellation in async Rust is infamous for being hard to get right, which is understandable when (among other things) core traits in the ecosystem do not correctly handle it.

Concurrent Iterator

As we're approaching the end of our exposition here, let's talk about the most elaborate variations on Iterator. First in line: the rayon crate and the ParallelIterator trait. Rayon provides what are called "parallel iterators" which process items concurrently rather than sequentially, using operating system threads. This tends to significantly improve throughput compared to sequential processing, but have the caveat that all consumed items must implement Send. To see just how familiar parallel iterators can be: the following example looks almost identical to a sequential iterator except for the call to into_par_iter instead of into_iter.

use rayon::prelude::*;

(0..100)
    .into_par_iter()   // ← Instead of calling `into_iter`.
    .for_each(|x| println!("{:?}", x));

The ParallelIterator trait however comes as a pair with the Consumer trait. It can be a little mind-boggling but the way Rayon works is that combinators can be chained to create a handler, which at the end of the chain is copied to each thread and used there to handle items. This is of course a simplified explanation; I'll defer to Rayon maintainers to provide a detailed explanation. To give you a sense how different these traits are from the regular Iteratortraits, here they are (simplified):

/// A consumer is effectively a generalized “fold” operation.
pub trait Consumer<Item>: Send + Sized {
    /// The type of folder that this consumer can be converted into.
    type Folder: Folder<Item, Result = Self::Result>;
    /// The type of reducer that is produced if this consumer is split.
    type Reducer: Reducer<Self::Result>;
    /// The type of result that this consumer will ultimately produce.
    type Result: Send;
}

/// A type that can be iterated over in parallel
pub trait IntoParallelIterator {
    /// What kind of iterator are we returning?
    type Iter: ParallelIterator<Item = Self::Item>;
    /// What type of item are we yielding?
    type Item: Send;
        /// Return a stateful parallel iterator.
    fn into_par_iter(self) -> Self::Iter;
}

/// Parallel version of the standard iterator trait.
pub trait ParallelIterator: Sized + Send {
    /// The type of item that this parallel iterator produces.
    type Item: Send;
    /// Internal method used to define the behavior of this
    /// parallel iterator. You should not need to call this
    /// directly.
    fn drive_unindexed<C>(self, consumer: C) -> C::Result
    where
        C: UnindexedConsumer<Self::Item>;
}

What matters most here is that using the ParallelIterator trait feels similar to a regular iterator. All you need to do is call into_par_iter instead of into_iter and you're off to the races. On the consuming side it seems like we should be able to author some variation of for..in to consume parallel iterators. Rather than speculate about syntax, we can look at the signature of ParallelIterator::for_each to see which guarantees this would need to make.

fn for_each<F>(self, f: F)
where
    F: Fn(Self::Item) + Sync + Send
{ .. }

We can observe three changes here from the base iterator trait:

  1. Self no longer needs to be Sized.
  2. Somewhat predictably the closure F needs to be thread-safe.
  3. The closure F needs to implement Fn rather than FnMut to prevent data races.

We can then infer that in the case of a parallel for..in expression, the loop body would not be able to close over any mutable references. This is an addition to the existing limitation that loop bodies already can't express FnOnce semantics and move values (e.g. "Warning: this value was moved in a previous iteration of the loop".)

An interesting combination of are "parallel iteration" and "async iteration". An interesting aspect of the async keyword in Rust is that it allows for ad-hoc concurrent execution of futures without needing to rely on special syscalls or OS threads. This means that concurrency and parallelism can be detached from one another. While we haven't yet seen a "parallel async iterator" trait in the ecosystem, the futures-concurrency crate does encode a "concurrent async iterator" 7. Just like ParallelIterator, ConcurrentAsyncIterator comes in a pair with a Consumertrait.

7

Technically this trait is called ConcurrentStream, but there is little that is Stream-dependent here. I called it that because it is compatible with the futures_core::Stream trait since futures-concurrency is intended to be a production crate.

/// Describes a type which can receive data.
pub trait Consumer<Item, Fut>
where
    Fut: Future<Output = Item>,
{
    /// What is the type of the item we’re returning when completed?
    type Output;
    /// Send an item down to the next step in the processing queue.
    async fn send(self: Pin<&mut Self>, fut: Fut) -> ConsumerState;
    /// Make progress on the consumer while doing something else.
    async fn progress(self: Pin<&mut Self>) -> ConsumerState;
    /// We have no more data left to send to the `Consumer`;
    /// wait for its output.
    async fn flush(self: Pin<&mut Self>) -> Self::Output;
}

pub trait IntoConcurrentAsyncIterator {
    type Item;
    type IntoConcurrentAsyncIter: ConcurrentAsyncIterator<Item = Self::Item>;
    fn into_co_iter(self) -> Self::IntoConcurrentAsyncIter;
}

pub trait ConcurrentAsyncIterator {
    type Item;
    type Future: Future<Output = Self::Item>;

    /// Internal method used to define the behavior
    /// of this concurrent iterator. You should not
    /// need to call this directly.
    async fn drive<C>(self, consumer: C) -> C::Output
    where
        C: Consumer<Self::Item, Self::Future>;
}

While ParallelIterator and ConcurrentAsyncIterator have similarities in both usage and design, they are different enough that we cant quite think as one being the async, non-thread-safe version of the other. Perhaps it is possible to bring both traits closer to one another, so that the only difference are a few strategically placed async keywords, but more research is needed to validate whether that is possible.

Another interesting bit to point out here: concurrent iteration is also mutually exclusive with lending iteration. A lending iterator relies on yielded items having sequential lifetimes, while concurrent lifetimes rely on yielded items having overlapping lifetimes. Those are fundamentally incompatible concepts.

Conclusion

And that's every iterator variant that I know of, bringing us to 17 different variations. If we take away the variants that we can express using subtraits (4 variants) and auto-traits (5 variants), we are left with 9 different variants. That's 9 variants, with 76 methods, and approximately 150 trait impls in the stdlib. That is a big API surface, and that's not even considering all the different combinations of iterators.

Iterator is probably the single-most complex trait in the language. It is a junction in the language where every effect, auto-trait, and lifetime feature ends up intersecting. And unlike similar junctions like the Fn-family of traits; the Iterator trait is stable, broadly used, and has a lot of combinators. Meaning it both has a broad scope, and stringent backwards-compatibility guarantees we need to maintain.

At the same time Iterator is also not that special either. It's a pretty easy trait to write by hand after all. The way I think of it is mainly as a canary for language-wide shortcomings. Iterator is for example not unique in its requirement for stable addresses: we want to be able to guarantee this for arbitrary types and to be able to use this with arbitrary interfaces. I believe that the question to ask here should be: what is stopping us from using address-sensitive types with arbitrary types and arbitrary interfaces? If we can answer that, not only will we have an answer for Iterator - we will have solved it for all other interfaces we did not consciously anticipate would want to interact with this 8.

8

This is adjacent to "known unknowns" vs "unknown unknowns" - we should not just cater to cases we can anticipate, but also to cases we cannot. And that requires analyzing patterns and thinking in systems.

In this post I've done my best to show by-example which limitations the Iterator trait has today, and how each variant can overcome those. And while I believe that we should try and lift those limitations over time, I don't think anyone is too keen on us minting 9 new variations on the core::iter submodule. Nor the thousand-plus possible combinations of those submodules (yay combinatorics). The only feasible approach I see to navigating the problem space is by extending, rather than replacing, the Iterator trait. Here is my current thinking for how we might be able to extend Iterator to support all iterator variants:

9

the const effect itself is already polymorphic over the comptime effect, since const fn means: "a function that can be executed either at comptime or runtime". Out of all the effect variants, const is most likely to happen in the near-term.

10

What we want to express is that we have an associated type which MAY take a lifetime, not MUST take a lifetime. That way we can pass a type by value where a type would otherwise be expected to be passed by-reference. This is different from both 'static lifetimes and &own references.

11

I tried answering how to add return values to the Iterator trait a year ago in my post Iterator as an Alias, but I came up short. As I mentioned earlier in the post: combining "return with value" and "may short-circuit with error" seems tricky. Maybe there is a combination here I'm missing / we can special-case something. But I haven't seen it yet.

When evolving the language, I believe we entire job is to balance feature development with the management of complexity. If done right, over time the language should not only become more capable, but also simpler and easier to extend. To quote something TC (T-Lang) said in a recent conversation: "We should get better at getting better every year". 12

As we think about how we wish to overcome the challenges presented by this post, it is my sincere hope that we will increasingly start thinking of ways to solve classes of problems that just happen to show up with Iterator first. While at the same time looking for opportunities to ship features sooner, without blocking ourselves on supporting all of the use cases straight away.

12

I recently remarked to TC that I've started to think about governance and language evolution as being about the derivative of the language and project. Rather than measuring the direct outcomes, we measure the process that produces those outcomes. TC remarked that what we should really care about is the double derivative. Not only should we improve our outcomes over time; the processes that produce those outcomes should improve over time as well. Or put differently: we should get better at getting better every year! I love this quote and I wanted y'all to read it too.