Async Iteration III: The Async Iterator Trait
— 2023-09-26

  1. fn poll_ vs async fn
  2. performance
  3. self-referential iterators
  4. unpin bounds
  5. implementation vs usage
  6. object safety
  7. "cancellation safety"
  8. pin ergonomics
  9. evolution
  10. conclusion

This post is part of the Async Iteration series:

  1. Async Iteration I: Async Iteration Semantics
  2. Async Iteration II: The Async Iterator Crate
  3. Async Iteration III: The Async Iterator Trait (this post)

Async Functions in Traits (AFIT) are in the process of being stabilized, and I figured it would be a good time to look more closely at the properties they provide. In this post I want to compare AFIT-based traits with poll-based traits, using the "async iterator" trait as the driving example. But most of what applies to async iterator will also apply to other traits such as async read and async write.

In this post I will make the case that the best direction for the stdlib is to base its async traits on AFITs. The intended audience for this post is primarily my fellow members of WG-Async, as well as members of T-Lang and T-Libs. To read a summary of the findings, jump ahead to the conclusion. This post assumes readers are familiar with the inner workings of Rust's async system, as well as with the tradeoffs being discussed.

fn poll_ vs async fn

To provide some flavor to what I'm talking about, in this post we'll be discussing the "async iterator" trait, asking the question whether we should base it on fn poll_next or async fn next. Here are both variants side-by-side:

// Using `fn poll_next`.
trait AsyncIterator {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}
// Using `async fn next`.
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

I expect pretty much everyone will agree that at first glance the async fn next-based trait seems easier to use. Rather than needing to think about what Pin is, or how Poll works, we can just write our async functions the way we usually do, and it will just work. Pretty neat!

But that's just on the surface. Does that still hold if we look more closely? Concerns have been raised about the performance of async fn next, claiming not only that it would perform worse, but also that it does not provide essential features - even going so far as to claim that fn poll_next is fundamentally lower-level and thus the only reasonable choice for a systems programming language. In the remainder of this post we'll go over those claims and show why, upon closer examination, they do not appear to hold.

Performance

Let's start with the most obvious one: performance. At its core Rust is a systems programming language, and in order to properly cater to its niche it tends to only provide abstractions which have performance comparable to their hand-rolled equivalents. The claim is that fn poll_next should provide better performance than async fn next, since with it we're writing the state machine by hand. But when actually measured, the two approaches compile to identical assembly in a variety of configurations - meaning they have identical performance.

But don't just take my word for it: we can use examples to substantiate this. Let's create a simple "once" async iterator which holds some data and yields it when polled. Rather than using complex async/.await machinery, we'll be creating a new function poll_once which constructs a dummy waker in-line and can be used to poll a future exactly once:

pub fn call_once() -> Poll<Option<usize>> {
    let mut iter = once(12usize);
    poll_once(iter.next())
}
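
For reference, here's a minimal sketch of what such a poll_once helper could look like. The helpers in the linked Compiler Explorer examples build Rc- and Arc-based wakers; this version uses a hand-rolled no-op waker just to keep the sketch short:

use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Polls a future exactly once, using a dummy waker which does nothing when woken.
pub fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    unsafe fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe fn no_op(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);

    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let fut = pin!(fut);
    fut.poll(&mut cx)
}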

Let's start by evaluating what this looks like when implemented using fn poll_next. We could write this as follows:

struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // SAFETY: `Once` has no fields which need to be structurally pinned,
        // so it's okay to unwrap the `Pin` and take the value out.
        let this = unsafe { Pin::into_inner_unchecked(self) };
        Poll::Ready(this.0.take())
    }
}

When polled we unwrap the Pin around Self to get at its only field, which is just an Option. We then call .take to extract the value, leaving None in its place. This should be fairly straightforward. If poll_once creates a non-atomic dummy waker (this is just the first example), the compiler will compile this code down to the following x86 assembly (compiler explorer):

example::call_once:
        mov     eax, 1
        mov     edx, 12
        ret

This assembly basically means: "Hey I've got the constant '12' over here - please move it into the return registers and then exit the function". That's about the smallest this function can be without being inlined. Now let's see what happens if we implement this code using async fn next. Instead of fn poll_next we can use an async function directly:

pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}

It's nice we don't have to perform pin projections anymore (more on that later). But what's the performance like? Well, if this were slower we'd expect it to generate more assembly. So let's take a look (compiler explorer):

example::call_once:
        mov     eax, 1
        mov     edx, 12
        ret

The assembly is identical! Why is that? Well, for one: the Rust compiler is pretty good at generating fast code. But we're also in a bit of a simplified environment. So far in our examples we haven't been using "real" thread-safe wakers, instead basing our wakers on Rc. What happens if we switch to Arc-based wakers? Here's the link to a compiler explorer comparing the two. It now generates a lot more assembly than before (yay atomics), but luckily we can use diff(1) to compare the output:

example::call_once:
        ; 21 lines of assembly + calls to another 118 lines
yosh@MacBook-Pro scratch % pbpaste > one.rs
yosh@MacBook-Pro scratch % pbpaste > two.rs
yosh@MacBook-Pro scratch % diff one.rs two.rs
yosh@MacBook-Pro scratch %

The diff output is empty, meaning there are no differences even when we use Arcs to correctly construct our wakers; it just generates a lot more code. But okay fine, maybe there are other differences? After all: fn poll_next has access to the Waker and can return Poll, meaning it has low-level control over the future state machine, while async fn next does not. What happens if we want that same low-level control over the future state machine from async fn next?

Luckily we've already stabilized a simple mechanism for this: std::future::poll_fn. This function lets us write poll-style code inside any async function, including AFITs, giving us access to the waker context and the ability to return Poll directly. Let's lower our example to make use of this, shall we?

pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        future::poll_fn(|_cx| /* -> Poll<Option<T>> */ {
            // We have access to `_cx` here, which contains the `Waker`.
            Poll::Ready(self.0.take())
        }).await
    }
}

This seems simple enough: whenever we want to do anything low-level inside of an async fn, we can use poll_fn to drop into the future state machine. This should work not just for the async version of the iterator trait, but for all async traits. There is more to be said about how this interacts with pinning and self-referential types, but we'll cover that in more detail later on in the post. To close this out though: what does this compile to if we call it using "real" wakers? (compiler explorer):

example::call_once:
        ; 21 lines of assembly + calls to another 118 lines
yosh@MacBook-Pro scratch % pbpaste > two.rs
yosh@MacBook-Pro scratch % diff one.rs two.rs
yosh@MacBook-Pro scratch %

That's right: the output remains the same. This gives us a pretty good clue about what is happening here. Inside the compiler async fn next is desugared to a future, just like fn poll_next is. And because of basic inlining and const-folding optimizations, the resulting state machines are identical - which means that the resulting assembly is identical too. This is exactly how zero-cost abstractions are supposed to work, and is the entire premise of Rust's async system. If we ever find a case where the optimizer doesn't perform those basic optimizations we can then treat that as a bug in the compiler - not a limitation of the design.
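
To illustrate what that desugaring looks like, here's a rough hand-written equivalent of the future the compiler could generate for Once::next - this is a sketch, not actual compiler output. The body contains no .await points, so the state machine has only a single state, and after inlining and const-folding nothing distinguishes it from the fn poll_next version:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// The same `Once` type as above, repeated so the sketch is self-contained.
pub struct Once<T>(Option<T>);

// A single-state future, standing in for what `Once::next` desugars to.
pub struct Next<'a, T> {
    once: &'a mut Once<T>,
}

impl<'a, T> Future for Next<'a, T> {
    type Output = Option<T>;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Next` only holds a mutable reference, so it is `Unpin` and we can
        // safely get a `&mut Next` back out of the `Pin`.
        let this = self.get_mut();
        Poll::Ready(this.once.0.take())
    }
}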

Self-Referential Iterators

When people say that "async iterator is not the async version of iterator" they are correct. Well, sort of. If we look at existing implementations that is true: it doesn't quite work like the async version of iterator. What it really is, is the async version of a "pinned iterator" - which is not a trait we currently have, though there certainly is a case to be made for it. The better question to ask is whether async iterator should be the "async version of iterator" - and I certainly believe it should be 1.

1

Incidentally, that has also been the framing of the trait which WG-async has been communicating to T-lang and T-libs, who have signed off on it. I'm not suggesting that this decision should bind us (I don't like to rules-lawyer). What I'm trying to show is that this has been the accepted framing of what the design should achieve for years now, and that we've already rejected the framing that "async iterator" (or "stream") should be its own special thing. That can certainly be changed again, but it is not a novel insight by any stretch.

Let me explain what I mean by this using examples. In Rust the base iterator API has an associated type Item and a function next, which takes a mutable reference to self and returns an Option<Self::Item>:

trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

If we did a direct translation to async Rust, we'd have an API which instead of exposing an fn next exposed an async fn next. The only real difference here is the addition of the async keyword:

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

However, when we look at the ecosystem Stream trait, or the currently unstable AsyncIterator API, we see that they are not implemented in terms of async fn next. Instead they provide an fn poll_next which takes a pinned reference to self and a mutable reference to the waker context, and wraps the return type in Poll:

trait AsyncIterator {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

In the previous section we already discussed how you can get access to the waker context from inside an async function by using poll_fn, so we can pretty much ignore the waker context and the Poll in the return type. That leaves the change in the self type: our async fn next takes &mut self, while this variant takes Pin<&mut Self>. That isn't necessary for the core functionality of async iterator - it pins more than is needed. Simply put, what fn poll_next really gives us is the async version of this trait:

trait PinnedIterator {
    type Item;
    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

Here we have a non-async version of iterator which takes self as a pinned reference. This is useful if you ever need to write an iterator which can operate on self-referential structs. For example: if we ever start thinking of stabilizing generator functions, we want them to be able to hold references across yield points. That will require self-referential iterators.

An important insight here is that the question of whether an iterator should be pinned is orthogonal to whether it is async. This is illustrated by the fact that we can formulate a "pinned asynchronous iterator" just fine using AFITs:

trait PinnedAsyncIterator {
    type Item;
    async fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

This can be combined with the poll_fn function as we showed in the previous section to recreate the low-level semantics of fn poll_next, providing access to both a pinned self-type and the future's waker argument. To put it plainly: "is async" and "is pinned" are orthogonal features, and fn poll_next needlessly combines both.
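
To make that concrete, here's a sketch of what that combination could look like in practice, using the PinnedAsyncIterator trait defined above. The Counter type is made up for illustration; the point is that the implementation has access to both a pinned self type and the waker context - everything fn poll_next provides:

use std::future;
use std::pin::Pin;
use std::task::Poll;

struct Counter(u32);

impl PinnedAsyncIterator for Counter {
    type Item = u32;

    async fn next(mut self: Pin<&mut Self>) -> Option<Self::Item> {
        future::poll_fn(|cx| {
            // The waker context is available here, just like in `fn poll_next`.
            let _waker = cx.waker();
            // `Counter` happens to be `Unpin`, so we can get a `&mut Counter`
            // back out of the pinned reference.
            let this = self.as_mut().get_mut();
            this.0 += 1;
            Poll::Ready(Some(this.0))
        })
        .await
    }
}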

Unpin Bounds

People occasionally ask me about the Unpin trait when I talk about async versions of traits. For example if you compare Iterator::next and futures::stream::StreamExt::next, you will see that the latter has an extra where Self: Unpin bound.

// `Iterator::next`
fn next(&mut self) -> Option<Self::Item>;

// `StreamExt::next`
fn next(&mut self) -> Next<'_, Self>
where
    Self: Unpin; // This is different

This extra Unpin bound is only needed because the trait is implemented in terms of a poll function - which by design takes Pin<&mut Self>. To call poll_next through a plain &mut Self, the returned Next future has to pin that reference on the fly, and that is only sound when Self: Unpin - hence the extra bound. You can see the same mechanism in action with the other poll-based traits, such as AsyncWriteExt::write which also carries a Self: Unpin bound.

Instead, if we recognize that the async counterparts to Rust's core traits don't actually need to be pinned in order to be implemented, we can drop Pin<&mut Self> from the signature. And since our type isn't pinned to begin with, there is nothing to opt back out of via Unpin - all the extra Unpin bounds simply go away. You can see this in action in the async-iterator crate, which provides a diverse range of methods on async iterator, none of which require additional Unpin bounds to function.
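
As a small illustration, here's a sketch of a generic consumer written against the async fn next-based AsyncIterator trait from the start of this post (the collect_all function is made up for this example). Note that neither Pin nor Unpin appears anywhere in its signature:

// Collect every item an async iterator yields into a `Vec`.
async fn collect_all<I: AsyncIterator>(mut iter: I) -> Vec<I::Item> {
    let mut items = Vec::new();
    while let Some(item) = iter.next().await {
        items.push(item);
    }
    items
}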

Implementation vs Usage

To stay on the topic of API shapes: one major downside of fn poll_next is that the "method to implement" and the "method to call" are different. In the regular iterator trait there is only one method, next, which is both implemented and called. The poll-based API, on the other hand, is only meant to be implemented; in virtually all cases the next function is the one you actually want to call. This is a major deviation from how all other traits in the stdlib work today.

This isn't just limited to async Iterator either. Presumably we'd want to adopt this approach for all async traits in the stdlib. That means users of async Rust would need to think of traits in the stdlib as somehow "different", and remember that they cannot directly implement the methods they're calling. Among others, the following APIs would be affected:

poll-based stdlib traits

trait name       to be implemented   to be called        is same?
async Read       fn poll_read        async fn read       ❌
async Write      fn poll_write       async fn write      ❌
async BufRead    fn poll_fill_buf    async fn fill_buf   ❌
async Seek       fn poll_seek        async fn seek       ❌
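
To make the split concrete, here's roughly what this looks like today with the futures crate's poll-based AsyncRead: the method an implementer writes is not the method a caller uses, and the callable version lives on a separate extension trait. This is a sketch; the Zeroes reader is made up for illustration:

use std::io;
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::io::{AsyncRead, AsyncReadExt};

// A reader which always produces zeroes.
struct Zeroes;

impl AsyncRead for Zeroes {
    // This is the method an *implementer* writes...
    fn poll_read(
        self: Pin<&mut Self>,
        _cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<io::Result<usize>> {
        buf.fill(0);
        Poll::Ready(Ok(buf.len()))
    }
}

async fn read_some() -> io::Result<()> {
    let mut reader = Zeroes;
    let mut buf = [1u8; 8];
    // ...while this is the method a *caller* uses, provided by the separate
    // `AsyncReadExt` extension trait (note that it requires `Self: Unpin`).
    let n = reader.read(&mut buf).await?;
    assert_eq!(n, 8);
    Ok(())
}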

Instead, if we base these traits on async fn, the method to implement and the method to call are identical. And as we've covered earlier, if anyone wants to manually author a poll-based state machine for any of these traits, poll_fn provides a uniform way to do so across all async traits:

async-fn based stdlib traits

trait name       to be implemented   to be called        is same?
async Read       async fn read       async fn read       ✅
async Write      async fn write      async fn write      ✅
async BufRead    async fn fill_buf   async fn fill_buf   ✅
async Seek       async fn seek       async fn seek       ✅

This might seem like a minor point, but we have to consider that every deviation from existing norms is a point of friction for users. To zoom out slightly: I don't believe that async Rust inherently needs to be much more difficult than regular Rust. But the missing language features, combined with subtle differences like these, eventually add up to an experience which is sufficiently different that the resulting system feels like an entirely different language - when in reality it does not need to be. For good measure, here are the existing non-async stdlib traits:

non-async stdlib traits

trait name   to be implemented   to be called   is same?
Read         fn read             fn read        ✅
Write        fn write            fn write       ✅
BufRead      fn fill_buf         fn fill_buf    ✅
Seek         fn seek             fn seek        ✅

Object Safety

So far we've only discussed the implementation side of the traits. However that isn't the complete story: we also need to consider auto traits and other subtle semantics. So let's look at those, starting with object safety. Out of the box, poll-based traits are dyn-safe. Say we wanted to implement a poll-based async iterator which produces an infinite number of meows; we could create a dyn variant like so (playground):

struct Cat;
impl AsyncIterator for Cat {
    type Item = String;
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        Poll::Ready(Some("meow".to_string()))
    }
}

fn dyn_iter() -> Box<dyn AsyncIterator<Item = String>> {
    Box::new(Cat {})
}

This is the same as any other dyn-safe trait, and doesn't require any additional steps. Nice! Now what happens if we try to rewrite it to use async fn next? We would probably write it like so (playground):

#![feature(async_fn_in_trait)]

struct Cat;
impl AsyncIterator for Cat {
    type Item = String;
    async fn next(&mut self) -> Option<Self::Item> {
        Some("meow".to_string())
    }
}

fn dyn_iter() -> Box<dyn AsyncIterator<Item = String>> {
    Box::new(Cat {})
}

However if we now try to compile this code, we get the following error:

error[E0038]: the trait `AsyncIterator` cannot be made into an object
  --> src/lib.rs:16:22
   |
16 | fn dyn_iter() -> Box<dyn AsyncIterator<Item = String>> {
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `AsyncIterator` cannot be made into an object
   |
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
  --> src/lib.rs:5:14
   |
3  | trait AsyncIterator {
   |       ------------- this trait cannot be made into an object...
4  |     type Item;
5  |     async fn next(&mut self) -> Option<Self::Item>;
   |              ^^^^ ...because method `next` is `async`
   = help: consider moving `next` to another trait

This would not happen if we were using the async-trait crate; it only happens if we use the AFIT language feature. But what exactly is going on here? In order for async traits to work in dyn contexts, we need to find a place to store the futures their methods return. In the async-trait crate this is always a Box, but in the language itself we can't just do that, because we've pinky-promised not to perform any "hidden" allocations 2. Solving this is pretty complicated, and will require a feature like dyn* to land. With dyn*, rather than calling Box::new you'd need to call a "dyn adapter" instead. For example:

2

Though in the past we have used allocations in language features as placeholders: that's how async/.await was originally stabilized in 2019. It wasn't until a year or so later that support landed for async functions which didn't box in their desugaring.

fn dyn_iter() -> Box<dyn AsyncIterator<Item = String>> {
    Boxed::new(Cat {}) // NOTE: using the `Boxed` dyn-adapter, not `Box`.
}

Needing to replace Box::new with Boxed::new is not a major difference. And it's still unclear what the upper bound on the ergonomics of dyn async traits is, since work on them has been paused for the past year in favor of stabilizing AFIT first. But it is going to be pretty confusing if certain async traits require Box while other async traits require a Boxed notation.

I believe the better direction is for all async traits in the stdlib to use the same mechanism to work with dyn, even if it is slightly less convenient at first. We can then gradually work on improving those ergonomics for all async traits, both in the stdlib and the ecosystem. This ensures consistency, enables us to design solutions which are shared by the entire async ecosystem, and prevents a possible permanent bifurcation of the async trait space.

"Cancellation Safety"

Another subtle topic here is the interaction between async traits and the property which has historically been called "cancellation safety". I'm putting it in quotes because the property is not actually about whether it is "safe to cancel" a future. A few months ago I gave a talk about this, which I'll summarize in this section. I'll explain what "cancellation safety" is, how it currently falls short, and how we may be able to fix those shortcomings. I particularly want to show how we can bring the "cancellation safety" property into the type system, which could enable async functions (including AFIT) to automatically provide it.

"Cancellation safety" refers to the ability to drop and recreate futures without any "side-effects" being observable. It's important to note that the word "safety" in this context has nothing to do with memory safety: in the absence of linear types, all types in Rust must be memory-safe to drop, including futures. The "safety" in "cancellation safety" refers to logical correctness only. Being able to drop and recreate futures is most relevant to the select! macro, which commonly does exactly that in a loop. But it can also come in useful if you're ever authoring future state machines by hand. Here is an example of how this property is documented today:

/// # Cancel safety
///
/// This method is cancel safe. If you use it as the event in a
/// `tokio::select!` statement and some other branch completes
/// first, then it is guaranteed that no data was read.

As I've mentioned, "cancellation safety" as a property currently only exists in documentation. This means that in order to learn whether you can drop and recreate a future without observable side effects, you need to consult the documentation for the method. The difference between async fn next and fn poll_next is that for the former the property will be documented on the implementation, whereas for the latter it will be documented on the trait definition. Practically that means that with fn poll_next you only need to remember that all calls to next are "cancellation-safe", whereas with async fn next you'd need to remember it for each implementation (or more likely: remember which impls aren't "cancellation-safe").

But we should legitimately be asking ourselves whether documentation is the best we can do here. RTFM is not really how we do things in the Rust project: we much prefer having the compiler tell you when you've messed up, and explain what to do instead. That is possible because of Rust's "type safety" guarantee - and APIs such as select! are decidedly not type-safe right now, which is why even after successfully compiling their code, people regularly find runtime bugs in their select! statements.
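
To make this concrete, here's a sketch of the kind of subtle difference that documentation currently has to carry, assuming a tokio-style mpsc channel. Both functions look almost identical, but only one of them can safely be dropped and recreated in a select! loop:

use tokio::sync::mpsc::Receiver;

// Only a single `.await` point: if this future is dropped, either a message
// was delivered (and returned) or nothing observable happened at all.
async fn recv_one(rx: &mut Receiver<u32>) -> Option<u32> {
    rx.recv().await
}

// Two `.await` points: if this future is dropped between them - which is
// exactly what `select!` does to losing branches - the first message is
// silently lost.
async fn recv_two(rx: &mut Receiver<u32>) -> Option<(u32, u32)> {
    let first = rx.recv().await?;
    let second = rx.recv().await?;
    Some((first, second))
}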

I believe a better direction would be to bring "cancellation safety" out of the docs and into the type system. We could do this by introducing a new subtrait, which I'll refer to in this post as AtomicFuture (but we can call it anything we like really; the semantics are what I care about most). We already have some precedent for this in the iterator family of traits, where for example the unstable TrustedLen trait provides additional guarantees on top of the existing Iterator trait. I imagine this would look something like this:

//! ⚠️  This uses a placeholder name and is intended as a design-sketch only. ⚠️

/// A future which performs at most a single async operation.
trait AtomicFuture: Future {}

One of the downsides of the current "cancellation safety" system is that we have to manually inspect anonymous futures to determine whether they're cancellation-safe or not. I believe that by bringing "cancellation safety" into the type system, the compiler should be able to figure it out automatically when the following conditions are met:

  1. No local state: The async context must contain at most one .await point. That way, if a future is cancelled, the cancellation can never happen in between two .await points.
  2. Virality: All futures awaited inside the async context implement the trait.

Any async fn or async {} block would automatically be able to implement AtomicFuture if those requirements are upheld - which I believe shouldn't be too hard for the compiler to figure out from the generated state machine 3. Maybe there is a case to be made for dedicated syntax for this too; I'm not sure. But that's something we can figure out later.

3

Also: this is something we'll want to do regardless if we ever get generator functions. It would be pretty bad if gen fn couldn't automatically implement marker traits on the type it returns.

// ✅ -> impl Future + AtomicFuture
async fn foo() -> u32 { 12 }

// ✅ -> impl Future + AtomicFuture
async fn foo<F: AtomicFuture>(fut: F) -> F::Output {
    fut.await
}

// ❌ -> impl Future
async fn foo<F: Future>(fut: F) -> F::Output {
    fut.await
}

// ❌ -> impl Future
async fn foo<F: AtomicFuture>(fut1: F, fut2: F) {
    fut1.await;
    fut2.await;
}

The compiler will only be able to automatically figure out whether AtomicFuture can be implemented for futures returned by async fn and async {}. For manually implemented futures the author of the future will need to uphold those guarantees themselves. The bar to meet there is that the future should only perform a single operation, and then return. read is a good example of a "stateless future", while the future join operation is a good example of a future which is not.

As a basis I think this is pretty good, but there are still some cases we haven't covered yet. Take for example tokio's Mutex::lock operation: it only performs one operation, but when its future is dropped and recreated it ends up at the back of the lock queue again - which in pathological cases could lead to some tasks never acquiring the lock. It's not marked "cancellation-safe" in the tokio docs, but the reason for that is not covered by the rules we've described so far.
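
For illustration, here's a sketch of the pathological pattern being described, using tokio-style APIs (the worker function and its timings are made up). Every time the timer branch wins the race, the lock() future is dropped; the next iteration creates a fresh lock() future which joins the back of the queue again:

use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Mutex;

async fn worker(mutex: Arc<Mutex<u64>>) {
    let mut interval = tokio::time::interval(Duration::from_millis(10));
    loop {
        tokio::select! {
            guard = mutex.lock() => {
                // We finally acquired the lock.
                println!("current value: {}", *guard);
            }
            _ = interval.tick() => {
                // The timer won this round; the `lock()` future is dropped here
                // and this task loses its place in the mutex's queue.
            }
        }
    }
}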

I think practically we'll want to play around with the rules a little. I don't think we want to - or even can - practically nail down what a "side-effect" is in Rust. But if we intentionally don't make the presence of AtomicFuture a safety invariant, we can probably get away with adding broad rules such as:

"Cancelling and recreating an AtomicFuture should not cause logic errors"

What does that mean exactly? That's up for interpretation, but that might be fine for our purposes. I think it's worth experimenting a little so we can settle on something that works for us - ideally a definition which can be automatically implemented by the compiler for async blocks and functions, but if we can't manage that, it's probably also okay. The key takeaway should be that if we care enough about "cancellation safety" to make it a guarantee of our public APIs, we should be trying to bring it into the trait system rather than leaving it in the docs.

Pin Ergonomics

I think if we're to have an honest conversation about poll-based APIs, we should acknowledge that working with Pin is not a great experience for most people. I'm a world-leading expert in async Rust, and four years post-stabilization I still regularly struggle to use it. I've seen folks on both T-lang and T-libs struggle with it. I've seen very capable engineers at Microsoft struggle with it. And if experienced folks struggle to use Pin, I don't think we can reasonably expect those with less experience to use it without problems either.

I've covered some of the unique difficulties of Pin on this blog before. I believe Pin is hard to use in part because it inverts Rust's usual semantics via a combination of auto-traits and wrapper structs. Another reason is that it relies heavily on the concept of "pin projection", which requires unsafe and is not directly represented in either the language or the stdlib. That in turn means that even the most common interactions with Rust's pinning system rely on third-party ecosystem crates such as pin-project and, until recently, pin-utils.

We currently don't have a clear path to fixing Pin's ergonomics. If we want Rust to provide a way to make pin projections safe, we'll at the very least need to make a breaking change to the Unpin trait 4, as well as do a whole lot of design work to integrate it into the language. And while this may be something we'll want to do eventually, we really need to ask ourselves whether it is something we want to do now - because taking on one project necessarily means not taking on another.

4

Safe pin projections in the language necessarily require that Unpin is an unsafe trait. It's currently not unsafe, and changing it would be a breaking change.

I honestly think async functions in traits, async drop, async iteration, and async closures are all far more important things to work on right now, and they should take priority in WG-async over directly fixing Pin's rough edges. The most effective strategy we have to reduce the cost of Pin in Rust right now is to simply make Pin less common. If Pin is only needed to implement intrusive collections, self-referential types, and manual future state machines, then for the time being we can treat it as an experts-only system. Once we have more bandwidth we can revisit it and tidy it up.

Evolution

A thing that worries me in particular about fn poll_next is its inability to co-evolve with Rust's needs. As we've seen, we're not just considering adding an "async version of iterator"; we're also talking about a "pinned version of iterator". And there are other flavors of iterator being discussed as well, such as a "fallible version of iterator", a "lending version of iterator", a "const version of iterator", and so on. It's impossible to know today which requirements Rust will have ten years from now. All we really know is that the decisions we make then will be affected by the decisions we make today 5.

5

Ten years ago we were unsure whether it was even possible to publish a new systems programming language. Five years ago we were unsure whether Rust would be able to escape its browser niche and reach mainstream adoption. Flash forward to today, and Rust is being used in the hearts of major operating systems (Android, Linux, and Windows), and every packet on the internet is more likely than not routed through some Rust code at some point (Cloudflare, AWS, and Azure). I don't think we can reasonably anticipate all the requirements Rust will face ten years from now. So I believe one of the most important things for a programming language is to find ways to keep evolving in the face of changing requirements.

Say for a second we did add an fn poll_next-based version of iterator. If we then wanted to, say, add a lending version of iterator, we'd probably also want to add a lending version of async iterator too. That would be four separate families of traits, and that's just for three variants. If we actually wanted to add support for "pinned iterators" or "fallible iterators" we'd be looking at nine or maybe even seventeen different families of traits. And nobody reasonably wants that.

This is why I believe async fn next is the better direction. If we add it to the stdlib via an effect-generic mechanism, both "iterator" and "async iterator" could be served by a single trait which is generic over the async effect. And we could use the same mechanism for other traits such as Read and Write, and maybe even for less common traits like Into and Display.

//! A version of iterator which can be implemented as either sync or async.
//! This uses placeholder syntax just to illustrate the idea.

#[maybe_async]
trait Iterator {
    type Item;
    #[maybe_async]
    fn next(&mut self) -> Option<Self::Item>;
}

fn poll_next forces us to duplicate existing interfaces in order to expose the same capabilities, while async fn next enables us to extend existing interfaces with new capabilities. That doesn't immediately solve all of the limitations of iterator; effect generics won't provide an answer for how to add support for "lending" or "pinned" iterator variants. But it does provide an answer to some other problems, like how to add support for async, const, and fallibility. And in doing so it encourages us to find similar solutions for the problems which remain, even if we don't yet know what those will look like.

Conclusion

In this post I've presented a case for basing the async Iterator trait on async fn next rather than fn poll_next. To summarize the arguments:

  1. Performance: async fn next and fn poll_next compile down to identical assembly, so there is no performance cost to the AFIT-based design.
  2. Orthogonality: "is async" and "is pinned" are separate concerns; fn poll_next needlessly combines them, and a pinned async iterator can still be expressed with AFIT if we ever need one.
  3. Unpin bounds: dropping Pin<&mut Self> from the signature makes the extra where Self: Unpin bounds disappear.
  4. Implementation vs usage: with async fn next the method you implement is the method you call, matching every other trait in the stdlib.
  5. Object safety: dyn support for AFIT-based traits still needs work, but using a single mechanism for all async traits keeps the ecosystem consistent.
  6. "Cancellation safety": today this is a documentation-only property; it is better addressed by bringing it into the type system, regardless of which trait shape we pick.
  7. Pin ergonomics: poll-based traits force every implementer to deal with Pin, which remains hard to use even for experts.
  8. Evolution: async fn next can be extended through effect generics, while fn poll_next forces us to keep duplicating entire trait families.

I believe basing the async Iterator trait on async fn next is superior across all of these axes, and I hope this post sufficiently makes that case. In a future post I'd like to round out this series by covering the desugaring of an async iteration notation. Together with an RFC for effect-generic trait definitions, this will be one of the last steps necessary before we can re-RFC RFC 2996: Async Iterator to cover the full scope of async iteration.

Thanks to Eric Holk for reviewing an earlier draft of this post.