Async Traits Can Be Directly Backed By Manual Future Impls
— 2025-05-26
- introduction
- naively backing afits with manual future impls
- directly backing afits with manual future impls
- simple inline poll state machines
- conclusion
- acknowledgements
Introduction
There’s a fun little tidbit that most people don’t seem to be aware of when
writing async functions in traits (AFITs) and that is: they allow you to
directly return futures from methods. Sounds confusing? Let me illustrate what I
mean with a simple example. Assume we have a trait AsyncIterator which has an
async fn next method that is defined as follows:
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}
If we wanted to define an iterator which yields an item exactly once, we would
probably write it as follows. This stores a value T in an Option, and in a
call to next we extract it and return Some(T) if we have a value, and None if
we don’t:
/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}
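To see this iterator in action without pulling in a runtime, here is a minimal sketch that drives next to completion. The NoopWake, block_on, and drive_once names are hypothetical helpers introduced only for this illustration; a real program would normally use an executor from an async runtime instead.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Yields an item exactly once
struct Once<T>(Option<T>);

impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls a future in a loop until it completes.
/// Good enough for futures that never really wait on anything.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Calls `next` twice and returns both results.
fn drive_once() -> (Option<u32>, Option<u32>) {
    let mut iter = Once(Some(7));
    let first = block_on(iter.next());
    let second = block_on(iter.next());
    (first, second)
}
```

Calling drive_once() should give back the stored value on the first call to next and None on the second.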
Easy right? Now let’s try and write this future by hand. Rust is a systems programming language, and so it’s important that we can drop into lower levels of abstraction to gain additional control when needed. I consider it to be a gap in the language whenever we can’t break through an abstraction into its constituent parts. So let’s see what that would look like.
Naively backing AFITs with manual future impls
Ok, so what are the constituent parts of an async fn? Well, that’s mainly the
Future trait. What we want to do here is to define our own future and use
that as the basis of our implementation. All we do is take the value out of an
Option, which makes it rather simple. Naively we would probably write something
like this:
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// The internal `Future` impl of our `Once` iterator
struct OnceFuture<'a, T>(&'a mut Once<T>);
impl<'a, T> Future for OnceFuture<'a, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `OnceFuture` only holds a `&mut` reference, so it is `Unpin`
        // and we can safely get a mutable reference out of the pin
        let this = self.get_mut();
        Poll::Ready(this.0.0.take())
    }
}
/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        // Delegate to the `OnceFuture` impl
        OnceFuture(self).await
    }
}
There’s a lot going on all of a sudden, right? This is the low-level version of
what we wrote earlier. Or is it? If you look carefully you might notice that
we’re dealing with one extra level of indirection here. In our iterator’s main
body we wrote: OnceFuture(self).await. In a simple benchmark the compiler
will happily optimize this. But in more complex programs the compiler may not.
And that’s an issue, because it means that switching to a lower-level
abstraction may yield worse performance.
If this were the best we could do, it would likely spell the death of AFITs in
the stdlib. AFITs would still be useful for high-level APIs that we can happily
implement using async fn, but not for serious implementations which need to be
optimized all the way, like those found in the stdlib. And that would point us
at things like poll_*-based traits which we can more directly implement by hand.
Directly backing AFITs with manual future impls
Luckily we can trivially remove the intermediate .await call from our previous
example using a little-known AFIT feature: the ability to directly return
futures. Instead of writing async fn next, we can write that same function as
fn next() -> impl Future. That’s almost the same, with some minor differences:
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// The internal `Future` impl of our `Once` iterator
struct OnceFuture<'a, T>(&'a mut Once<T>);
impl<'a, T> Future for OnceFuture<'a, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `OnceFuture` only holds a `&mut` reference, so it is `Unpin`
        // and we can safely get a mutable reference out of the pin
        let this = self.get_mut();
        Poll::Ready(this.0.0.take())
    }
}
/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    // Directly returns the `OnceFuture` impl
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        OnceFuture(self) // ← no more `.await`
    }
}
Do you see the fn next impl? It now directly returns the OnceFuture. And we
can do this even though the trait definition explicitly defined the trait method
as async fn next. This means there is no intermediate .await call required
to base our AFIT-based trait on a manual future impl. We’re no longer depending
on the compiler to optimize this for us to get equivalent performance, which
meets our goal of being able to manually lower the high-level notation into its
constituent parts, which we can then control by hand.
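This works because an async fn in a trait is sugar for a method returning impl Future, so an impl is free to pick either spelling. The sketch below puts both spellings behind the same trait and drives them through one generic function. OnceSugared, OnceDirect, OnceDirectFuture, drain, compare_impls, NoopWake, and block_on are hypothetical names used only for this illustration.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Implemented with the high-level `async fn` notation.
struct OnceSugared<T>(Option<T>);

impl<T> AsyncIterator for OnceSugared<T> {
    type Item = T;
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }
}

/// Implemented by directly returning a hand-written future.
struct OnceDirect<T>(Option<T>);

struct OnceDirectFuture<'a, T>(&'a mut OnceDirect<T>);

impl<T> Future for OnceDirectFuture<'_, T> {
    type Output = Option<T>;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // The future only holds a `&mut` reference, so it is `Unpin`.
        let this = self.get_mut();
        Poll::Ready(this.0.0.take())
    }
}

impl<T> AsyncIterator for OnceDirect<T> {
    type Item = T;
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        OnceDirectFuture(self)
    }
}

/// Generic caller: it cannot tell which spelling the impl used.
async fn drain<I: AsyncIterator>(iter: &mut I) -> Vec<I::Item> {
    let mut items = Vec::new();
    while let Some(item) = iter.next().await {
        items.push(item);
    }
    items
}

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls a future in a loop until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Drains one iterator of each kind through the same generic function.
fn compare_impls() -> (Vec<u32>, Vec<u32>) {
    let mut sugared = OnceSugared(Some(1));
    let mut direct = OnceDirect(Some(1));
    (block_on(drain(&mut sugared)), block_on(drain(&mut direct)))
}
```

From the caller’s point of view the two impls are interchangeable: drain only ever sees the trait’s async fn next.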
Simple inline poll state machines
The Rust stdlib provides a future::poll_fn convenience function that allows
you to create stateless futures (i.e. futures that do not hold their own state
across poll calls). But because it takes an FnMut closure, it can reference
external state made available to it via upvars. This allows us to rewrite the
chonky manual future implementation from our last two examples as a sleek inline
call to poll_fn:
use std::future::{self, Future};
use std::task::Poll;

/// Yields an item exactly once
pub struct Once<T>(Option<T>);
impl<T> AsyncIterator for Once<T> {
    type Item = T;
    fn next(&mut self) -> impl Future<Output = Option<T>> {
        // `move` captures the `&mut self` reference by value
        future::poll_fn(move |_cx| /* -> Poll<Option<T>> */ {
            Poll::Ready(self.0.take())
        })
    }
}
This is incredibly convenient because it allows us to quickly write some
optimized inline poll-based state machine code inside trait implementations. Or
really: any async context. I like to think of poll_fn as a way to quickly
step into “low level async” mode, kind of like how unsafe {} allows you to
step into “raw memory mode”.
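The Once example always completes on the first poll, so here is a sketch of an inline state machine that actually suspends. Countdown is a hypothetical iterator that returns Pending once before every item. Because poll_fn itself is stateless, the state lives on self and is reached through the closure’s captures. The block_on_counting, NoopWake, and run_countdown helpers are likewise made up for this illustration.

```rust
use std::future::{self, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Hypothetical iterator: counts down from `remaining` to 1 and
/// returns `Pending` once before each item.
struct Countdown {
    remaining: u32,
    yielded: bool,
}

impl AsyncIterator for Countdown {
    type Item = u32;
    fn next(&mut self) -> impl Future<Output = Option<u32>> {
        // `move` captures the `&mut self` reference by value
        future::poll_fn(move |cx| {
            if self.remaining == 0 {
                return Poll::Ready(None);
            }
            if !self.yielded {
                // First poll for this item: remember that we yielded
                // and ask to be polled again right away.
                self.yielded = true;
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
            // Second poll: reset the flag and hand out the item.
            self.yielded = false;
            let item = self.remaining;
            self.remaining -= 1;
            Poll::Ready(Some(item))
        })
    }
}

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls in a loop until the future completes and
/// also reports how many polls that took.
fn block_on_counting<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}

/// Collects every item and the total number of polls needed.
fn run_countdown() -> (Vec<u32>, usize) {
    let mut iter = Countdown { remaining: 3, yielded: false };
    let mut items = Vec::new();
    let mut total_polls = 0;
    loop {
        let (item, polls) = block_on_counting(iter.next());
        total_polls += polls;
        match item {
            Some(n) => items.push(n),
            None => break,
        }
    }
    (items, total_polls)
}
```

Each item takes two polls (one Pending, one Ready) and the final None takes one, so three items add up to seven polls in total.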
This also clearly makes the Future trait the fundamental building block of
everything async. Which is a simpler and more robust model than the
alternative fn poll_* model, under which async operations can be implemented
in terms of any of Future::poll, AsyncIterator::poll_next,
AsyncRead::poll_read, and so on.
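As a rough illustration of how the two models relate, the sketch below defines a hypothetical PollIterator trait in the fn poll_* style and wraps it so it can be used through the AFIT-based trait via poll_fn. None of these names (PollIterator, PollOnce, FromPoll, bridge_demo, NoopWake, block_on) come from the stdlib, and the Unpin bounds are an assumption made to keep the sketch free of unsafe code.

```rust
use std::future::{self, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait in the `fn poll_*` style.
trait PollIterator {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// The AFIT-based trait used throughout this post.
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// A poll-based "yield exactly once" iterator.
struct PollOnce<T>(Option<T>);

impl<T: Unpin> PollIterator for PollOnce<T> {
    type Item = T;
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<T>> {
        Poll::Ready(self.get_mut().0.take())
    }
}

/// Hypothetical wrapper exposing a poll-based iterator through the
/// AFIT-based trait.
struct FromPoll<P>(P);

impl<P: PollIterator + Unpin> AsyncIterator for FromPoll<P> {
    type Item = P::Item;
    fn next(&mut self) -> impl Future<Output = Option<P::Item>> {
        // Every poll of the returned future forwards to `poll_next`.
        future::poll_fn(move |cx| Pin::new(&mut self.0).poll_next(cx))
    }
}

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls a future in a loop until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Drives the wrapped poll-based iterator through `next`.
fn bridge_demo() -> (Option<u32>, Option<u32>) {
    let mut iter = FromPoll(PollOnce(Some(5)));
    (block_on(iter.next()), block_on(iter.next()))
}
```

With Future as the single building block, the poll-style body simply becomes the body of the future returned from next.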
Conclusion
I wrote this post because most people don’t seem to be aware that AFITs can be used like this, even though it’s arguably one of their most important capabilities. And that matters because, as I said earlier in this post, it would be really bad if the async trait space was bifurcated between:
- AFIT-based traits: which are convenient to implement but have worse performance due to lack of control.
- Poll-based traits: which are inconvenient to implement but have better performance because of the control provided.
This post shows that AFIT-based traits are both convenient to implement, and when desired provide the control needed to guarantee performance via manual poll-state machine implementations. This unifies the design space for async traits, removing the choice between “the easy API” and “the fast API”. With AFITs the easy API is the fast API.
While we used the AsyncIterator trait as an example in this post, nothing in
this post so far has been specific to AsyncIterator. Being able to control the
future state machines of AsyncRead, AsyncWrite, and so on isn’t any less
important. But if we were to consider an AFIT-based AsyncIterator specifically
with an async gen {} feature we can see that it provides three total levels of
control, where a poll-based AsyncIterator trait only provides two [1]:
| | AFIT-based | poll_*-based |
|---|---|---|
| async gen {} | ✅ | ✅ |
| async fn | ✅ | ❌ |
| fn poll state machine | ✅ | ✅ |
And it’s important that all three levels of abstraction are convenient to
implement. It would be wrong to pick and choose. Being able to implement async
traits based on async fn is important in cases where a high-level construct like
gen {} is not available, like with AsyncRead and AsyncWrite. But it’s also
important for AsyncIterator to be able to implement subtrait variants like
ExactSizeIterator, and be able to implement provided methods like fn size_hint.
async gen {} doesn’t provide the ability to do either [2], and users shouldn’t
be forced to write poll-based state machines just for that.
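To make the subtrait and provided-method point concrete, here is a sketch in which an async fn based impl also overrides size_hint and opts into an exact-size subtrait. The size_hint method on AsyncIterator and the ExactSizeAsyncIterator trait are assumptions modeled on Iterator::size_hint and ExactSizeIterator, not existing stdlib APIs, and NoopWake, block_on, and len_before_and_after are hypothetical helpers.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;

    /// Provided method with a conservative default (assumed here,
    /// mirroring `Iterator::size_hint`).
    fn size_hint(&self) -> (usize, Option<usize>) {
        (0, None)
    }
}

/// Hypothetical subtrait, named after `ExactSizeIterator`.
trait ExactSizeAsyncIterator: AsyncIterator {
    fn len(&self) -> usize {
        let (lower, upper) = self.size_hint();
        debug_assert_eq!(upper, Some(lower));
        lower
    }
}

/// Yields an item exactly once
struct Once<T>(Option<T>);

impl<T> AsyncIterator for Once<T> {
    type Item = T;

    // The high-level notation is all we need for `next`...
    async fn next(&mut self) -> Option<T> {
        self.0.take()
    }

    // ...and we can still override a provided method.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let n = usize::from(self.0.is_some());
        (n, Some(n))
    }
}

// Opting into the subtrait needs no poll-based code either.
impl<T> ExactSizeAsyncIterator for Once<T> {}

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls a future in a loop until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Reports the length before and after taking the single item.
fn len_before_and_after() -> (usize, Option<u32>, usize) {
    let mut iter = Once(Some(9));
    let before = iter.len();
    let item = block_on(iter.next());
    let after = iter.len();
    (before, item, after)
}
```

The impl never touches Future::poll, yet it still gets a precise size_hint and a len method.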
Acknowledgements
I’d like to thank Oli Scherer for taking the time late last year to work with me through a series of examples of AFITs backed by manually implemented futures, and explaining how they are resolved and optimized by the compiler. That conversation is what has given me the confidence to make the claims in this post with a degree of certainty I would otherwise not have. Oli did however not review this post in advance, so I’m most definitely not speaking on their behalf and any mistakes in this post will be my own.
[1] I’m sure with enough language magic we could provide some way to allow fn poll_*-based traits to be implemented in terms of async fn. But that would require an entirely new language feature, all to just end up with a less consistent and more complicated model than if we directly base all async traits (but Future) on async fn.
[2] Capability-based arguments aside, I also believe that implementing traits should be easy. Like, what do you mean implementing a non-blocking iterator by hand requires a PhD in Rustology? Rust is supposed to make systems programming accessible and understandable. And that needs to apply to every level of abstraction.