Async Traits Can Be Directly Backed By Manual Future Impls
— 2025-05-26

  1. introduction
  2. naively backing afits with manual future impls
  3. directly backing afits with manual future impls
  4. simple inline poll state machines
  5. conclusion
  6. acknowledgements

Introduction

There’s a fun little tidbit that most people don’t seem to be aware of when writing async functions in traits (AFITs), and that is: they allow you to directly return futures from methods. Sounds confusing? Let me illustrate what I mean with a simple example. Assume we have a trait AsyncIterator which has an async fn next method that is defined as follows:

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

If we wanted to define an iterator which yields an item exactly once, we would probably write it as follows. This stores a value T in an Option, and in a call to next we extract it and return Some(T) if we have a value, and None if we don’t:

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}
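To see this iterator in action we need something that drives the future to completion. The following is a minimal sketch, assuming a hand-rolled `block_on` helper and a `NoopWake` waker (both are hypothetical names introduced here for illustration, not part of the stdlib); a real program would normally use an async runtime instead:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}

/// Hypothetical waker that does nothing when woken.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut once = Once(Some(1));
    // The first call yields the stored value, every later call yields `None`.
    assert_eq!(block_on(once.next()), Some(1));
    assert_eq!(block_on(once.next()), None);
}
```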

Easy right? Now let’s try and write this future by hand. Rust is a systems programming language, and so it’s important that we can drop into lower levels of abstraction to gain additional control when needed. I consider it to be a gap in the language whenever we can’t break through an abstraction into its constituent parts. So let’s see what that would look like.

Naively backing AFITs with manual future impls

Ok, so what are the constituent parts of an async fn? Well, that’s mainly the Future trait. What we want to do here is to define our own future and use that as the basis of our implementation. All we do is deref an option, and take its internal value - which makes it rather simple. Naively we would probably write something like this:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// The internal `Future` impl of our `Once` iterator
struct OnceFuture<'a, T>(&'a mut Once<T>);
impl<'a, T> Future for OnceFuture<'a, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `OnceFuture` only holds a `&mut` reference, so it is `Unpin`
        // and we can safely get a plain `&mut Self` out of the pin.
        let this = self.get_mut();
        Poll::Ready(this.0.0.take())
    }
}

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        // Delegate to the `OnceFuture` impl
        OnceFuture(self).await
    }
}

There’s a lot going on all of a sudden, right? This is the low-level version of what we wrote earlier. Or is it? If you look carefully you might notice that we’re dealing with one extra level of indirection here. In our iterator’s main body we wrote: OnceFuture(self).await. In a simple benchmark the compiler will happily optimize this. But in more complex programs the compiler may not. And that’s an issue, because it means that switching to a lower-level abstraction may yield worse performance.

If this were the best we could do, it would likely spell the death of AFITs in the stdlib. AFITs would be useful for high-level APIs that we can implement using async fn, but not for serious implementations which need to be optimized all the way, like those found in the stdlib. And that would point us at things like poll_*-based traits, which we can more directly implement by hand.

Directly backing AFITs with manual future impls

Luckily we can trivially remove the intermediate .await call from our previous example using a little-known AFIT feature: the ability to directly return futures. Instead of writing async fn next, we can write that same function as fn next(&mut self) -> impl Future. The result is almost the same as before, with some minor differences:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// The internal `Future` impl of our `Once` iterator
struct OnceFuture<'a, T>(&'a mut Once<T>);
impl<'a, T> Future for OnceFuture<'a, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `OnceFuture` only holds a `&mut` reference, so it is `Unpin`
        // and we can safely get a plain `&mut Self` out of the pin.
        let this = self.get_mut();
        Poll::Ready(this.0.0.take())
    }
}

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    // Directly returns the `OnceFuture` impl
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        OnceFuture(self) // ← no more `.await`
    }
}

Do you see the fn next impl? It now directly returns the OnceFuture. And we can do this even though the trait definition explicitly defined the trait method as async fn next. This means there is no intermediate .await call required to base our AFIT-based trait on a manual future impl. We’re no longer depending on the compiler to optimize the wrapper away to get equivalent performance, which meets our goal: being able to manually lower the high-level notation into constituent parts that we control.
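Callers can't tell which of the two forms an implementation used. The sketch below illustrates that under a few assumptions: `Empty`, `count`, `block_on`, and `NoopWake` are hypothetical names made up for this example. One type implements the trait with plain `async fn`, the other returns a hand-written future, and the same generic function consumes both:

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Hypothetical iterator implemented with plain `async fn`.
struct Empty;
impl AsyncIterator for Empty {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        None
    }
}

/// Hand-written future backing `Once::next`.
struct OnceFuture<'a, T>(&'a mut Once<T>);
impl<'a, T> Future for OnceFuture<'a, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut(); // fine: `OnceFuture` is `Unpin`
        Poll::Ready(this.0.0.take())
    }
}

struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    // Same trait method, written as a plain `fn` returning a future.
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        OnceFuture(self)
    }
}

/// Hypothetical generic consumer: counts the items an iterator yields.
async fn count<I: AsyncIterator>(mut iter: I) -> usize {
    let mut n = 0;
    while iter.next().await.is_some() {
        n += 1;
    }
    n
}

/// Hypothetical no-op waker and minimal executor, only here to run the sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    assert_eq!(block_on(count(Empty)), 0);
    assert_eq!(block_on(count(Once(Some("x")))), 1);
}
```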

Simple inline poll state machines

The Rust stdlib provides a future::poll_fn convenience function that allows you to create stateless futures (i.e. futures that do not hold their own state across poll calls). But because it takes an FnMut closure, it can reference external state made available to it via upvars. This allows us to rewrite the chonky manual future implementation from our last two examples as a sleek inline call to poll_fn:

use std::future::{self, Future};
use std::task::Poll;

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        // `move` hands the `&mut self` borrow to the closure
        future::poll_fn(move |_cx| /* -> Poll<Option<T>> */ {
            Poll::Ready(self.0.take())
        })
    }
}

This is incredibly convenient because it allows us to quickly write some optimized inline poll-based state machine code inside trait implementations. Or really: any async context. I like to think of poll_fn as a way to quickly step into “low level async” mode, kind of like how unsafe {} allows you to step into “raw memory mode”.
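As a sketch of what stepping into low-level mode inside an ordinary async fn can look like, here is a small example under stated assumptions: `yield_times`, `block_on`, and `NoopWake` are hypothetical helpers written for this illustration. The closure keeps its state in upvars (`n` and `polls`), returns Poll::Pending a few times, and asks to be polled again through the waker:

```rust
use std::future::{self, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: returns `Pending` `n` times before completing,
/// and reports how many times it was polled in total.
async fn yield_times(mut n: u32) -> u32 {
    let mut polls = 0;
    future::poll_fn(|cx| {
        polls += 1;
        if n == 0 {
            Poll::Ready(())
        } else {
            n -= 1;
            // Ask the executor to poll us again before suspending.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    })
    .await;
    polls
}

/// Hypothetical no-op waker and minimal executor, only here to run the sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    // Three `Pending` results plus the final `Ready` poll.
    assert_eq!(block_on(yield_times(3)), 4);
}
```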

This also clearly makes the Future trait the fundamental building block of everything async. Which is a simpler and more robust model than the alternative fn poll_* model, under which async operations can be implemented in terms of Future::poll, AsyncIterator::poll_next, AsyncRead::poll_read, and so on.

Conclusion

I wrote this post because most people don’t seem to be aware that AFITs can be used like this, though arguably it’s one of their most important capabilities. And that’s important because, as I said earlier in this post, it would be really bad if the async trait space was bifurcated between AFIT-based traits that are convenient to implement using async fn, and poll_*-based traits that can be optimized by hand.

This post shows that AFIT-based traits are both convenient to implement, and when desired provide the control needed to guarantee performance via manual poll-state machine implementations. This unifies the design space for async traits, removing the choice between “the easy API” and “the fast API”. With AFITs the easy API is the fast API.

While we used the AsyncIterator trait as an example in this post, nothing in this post so far has been specific to AsyncIterator. Being able to control the future state machines of AsyncRead, AsyncWrite, and so on isn’t any less important. But if we were to consider an AFIT-based AsyncIterator specifically with an async gen {} feature we can see that it provides three total levels of control, where a poll-based AsyncIterator trait only provides two [1]:

                         AFIT-based   poll_*-based
async gen {}             yes          yes
async fn                 yes          no
fn poll state machine    yes          yes

All three levels of abstraction matter, and each should be convenient to implement. It would be wrong to pick and choose. Being able to implement async traits based on async fn is important in cases where a high-level construct like gen {} is not available, like with AsyncRead and AsyncWrite. But it’s also important for AsyncIterator, so that we can implement subtrait variants like ExactSizeIterator and override provided methods like fn size_hint. async gen {} doesn’t provide the ability to do either [2], and users shouldn’t be forced to write poll-based state machines just for that.
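To make the size_hint and subtrait point concrete, here is a rough sketch. It assumes the trait carries a provided size_hint method modeled on Iterator::size_hint, and it uses a hypothetical ExactSizeAsyncIterator subtrait; neither is taken from an actual stdlib design. Because Once is a hand-written impl with async fn next, it can override the provided method and opt into the subtrait:

```rust
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;

    /// Provided method (assumed here), modeled on `Iterator::size_hint`.
    fn size_hint(&self) -> (usize, Option<usize>) {
        (0, None)
    }
}

/// Hypothetical subtrait, modeled on `ExactSizeIterator`.
trait ExactSizeAsyncIterator: AsyncIterator {
    fn len(&self) -> usize {
        let (lower, upper) = self.size_hint();
        assert_eq!(upper, Some(lower));
        lower
    }
}

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
    // Override the provided method with an exact answer.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let n = usize::from(self.0.is_some());
        (n, Some(n))
    }
}
impl<T> ExactSizeAsyncIterator for Once<T> {}

fn main() {
    let once = Once(Some('a'));
    assert_eq!(once.size_hint(), (1, Some(1)));
    assert_eq!(once.len(), 1);
}
```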

Acknowledgements

I’d like to thank Oli Scherer for taking the time late last year to work with me through a series of examples of AFITs backed by manually implemented futures, and explaining how they are resolved and optimized by the compiler. That conversation is what has given me the confidence to make the claims in this post with a degree of certainty I would otherwise not have. Oli did however not review this post in advance, so I’m most definitely not speaking on their behalf and any mistakes in this post will be my own.

[1]

I’m sure with enough language magic we could provide some way to allow fn poll_*-based traits to be implemented in terms of async fn. But that would require an entirely new language feature, all to just end up with a less consistent and more complicated model than if we directly base all async traits (but Future) on async fn.

[2]

Capability-based arguments aside, I also believe that implementing traits should be easy. Like, what do you mean implementing a non-blocking iterator by hand requires a PhD in Rustology? Rust is supposed to make systems programming accessible and understandable. And that needs to apply to every level of abstraction.