Futures Concurrency II: A Trait Approach
— 2021-09-02

It's been exactly two years since I wrote about futures concurrency. Some work has happened since, and I figured it would be a good time to recap futures concurrency, and share some of the new developments.

In this post we'll establish a base model of async concurrency for Rust. We'll cover ergonomic new APIs to implement said model. And finally we'll describe a plausible path towards inclusion of those APIs in the stdlib in the short-term.

Modes of concurrency

Concurrency can be thought of as awaiting multiple things at the same time on the same thread. Generally we can divide the various modes of concurrency up along two axes:

Do we want to wait for all outputs, or just the first output?
When an error occurs, do we keep going or do we return early ("short circuit")?

As of August 2020, JavaScript has had the following Promise concurrency APIs available on most platforms:

	Wait for all outputs	Wait for first output
Continue on error	`Promise.allSettled`	`Promise.any`
Return early on error	`Promise.all`	`Promise.race`

JavaScript covers most concurrency uses really well. Rust has a counterpart to JavaScript's Promise type in Future, but it differs in two key ways:

Promises are always fallible, while Futures have the option not to be.
Promises start executing immediately, while Futures need to be invoked first. ¹

In terms of execution, the rough equivalent of a Promise in JavaScript is Task in Rust (starts executing immediately on creation). The rough equivalent of a Future in Rust is a "thenable" in JavaScript (starts executing once awaited).
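To make the laziness concrete, here's a minimal sketch using only the standard library. The `block_on` helper and `lazy_demo` function are names made up for this example (they're not part of std or async-std), and the busy-polling executor is only suitable for a demo:

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a busy-polling demo executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// Returns whether the async body had run (before polling, after polling).
fn lazy_demo() -> (bool, bool) {
    let ran = Cell::new(false);
    // Creating the future does not execute its body.
    let fut = async {
        ran.set(true);
    };
    let ran_before_poll = ran.get();
    // Only driving the future to completion runs the body.
    block_on(fut);
    (ran_before_poll, ran.get())
}
```

Nothing inside the `async` block happens until `block_on` polls it, which is the "needs to be invoked first" behavior described above.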

For async-std we've been experimenting with introducing similar concurrency APIs as those found for JavaScript Promises as part of our future submodule. For Rust Futures which don't return Result we expose two methods for concurrency:

Wait for all outputs	Wait for first output
`Future::join`	`Future::race`

When we add fallibility to the mix (e.g. return a Result from the future), async-std's try_ methods for concurrency become available, and we can compare them 1:1 to JavaScript's APIs:

	Wait for all outputs	Wait for first output
Continue on error	`Future::join`	`Future::try_race`
Return early on error	`Future::try_join`	`Future::race`

The try prefix can be thought of as inverting the semantics of the method. In the case of try_join we no longer wait for all items to complete; on error we'll now exit early and cancel all others. In the case of try_race we no longer wait for the first item to complete; on error we'll keep waiting until we either find a successful item or run out of items.
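As a rough illustration of the short-circuiting behavior, here's a sketch of what a two-future `try_join` could look like using only the standard library. `try_join2`, `block_on`, and the demo functions are hypothetical names for this example; this is not async-std's actual implementation:

```rust
use std::future::{pending, poll_fn, ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// Waits for both futures, but returns as soon as either one yields an error.
async fn try_join2<A, B, T, U, E>(a: A, b: B) -> Result<(T, U), E>
where
    A: Future<Output = Result<T, E>>,
    B: Future<Output = Result<U, E>>,
{
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut out_a: Option<T> = None;
    let mut out_b: Option<U> = None;
    let result = poll_fn(|cx| {
        if out_a.is_none() {
            match a.as_mut().poll(cx) {
                Poll::Ready(Ok(value)) => out_a = Some(value),
                Poll::Ready(Err(err)) => return Poll::Ready(Err(err)),
                Poll::Pending => {}
            }
        }
        if out_b.is_none() {
            match b.as_mut().poll(cx) {
                Poll::Ready(Ok(value)) => out_b = Some(value),
                Poll::Ready(Err(err)) => return Poll::Ready(Err(err)),
                Poll::Pending => {}
            }
        }
        match (out_a.take(), out_b.take()) {
            (Some(x), Some(y)) => Poll::Ready(Ok((x, y))),
            // Not done yet: put back whatever we already have.
            (x, y) => {
                out_a = x;
                out_b = y;
                Poll::Pending
            }
        }
    })
    .await;
    result
}

fn all_ok_demo() -> Result<(u8, u8), &'static str> {
    block_on(try_join2(ready(Ok(1)), ready(Ok(2))))
}

fn short_circuit_demo() -> Result<(u8, u8), &'static str> {
    // The first future never completes; the second fails immediately.
    let never = pending::<Result<u8, &'static str>>();
    let fails = ready(Err::<u8, &'static str>("boom"));
    block_on(try_join2(never, fails))
}
```

In `short_circuit_demo` the first future never completes, yet the call still returns the error from the second one. A plain `join` would wait forever in that situation.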

This is the model we wrote about two years ago, and aside from a rename (select -> race) it's remained identical. However despite that, we still haven't attempted to introduce this functionality into the stdlib yet. And that's because of missing ✨ language features ✨.

Joining tuples

In my post "future::join and const-eval" I talk about the issue of joining multiple futures:

[...] once we start joining more than two futures, the output types become different:

let a = future::ready(1u8);
let b = future::ready(2u8);
let c = future::ready(3u8);
assert_eq!(join!(a, b, c).await, (1, 2, 3));

let a = future::ready(1u8);
let b = future::ready(2u8);
let c = future::ready(3u8);
assert_eq!(a.join(b).join(c).await, ((1, 2), 3)); // oh no!

As you can see, each invocation of Future::join returns a tuple. But that means that chaining calls to it starts to nest tuples, which becomes hard to use. And it becomes more nested the more times you chain. Oh no!
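The nesting is easy to reproduce with a hand-rolled two-future join. This is a sketch under the assumption that a free function `join2` (a made-up name, not async-std's method) behaves like a single `join` call. Each call wraps its two outputs in a fresh tuple, so the shape of the result mirrors the order of the calls:

```rust
use std::future::{poll_fn, ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// Polls both futures until each has produced its output.
async fn join2<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut out_a: Option<A::Output> = None;
    let mut out_b: Option<B::Output> = None;
    poll_fn(|cx| {
        if out_a.is_none() {
            if let Poll::Ready(value) = a.as_mut().poll(cx) {
                out_a = Some(value);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(value) = b.as_mut().poll(cx) {
                out_b = Some(value);
            }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    })
    .await;
    (out_a.expect("a completed"), out_b.expect("b completed"))
}

/// Joining three futures with a two-argument join forces a nested tuple.
fn nested_demo() -> ((u8, u8), u8) {
    block_on(join2(join2(ready(1u8), ready(2u8)), ready(3u8)))
}
```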

In the post we explain that until then we'd been solving this ² by using a join! macro, which by nature is variadic ³. However ideally we would be able to write methods which are generic over tuples. Not only would that allow us to write a better join function, it would also save the stdlib from needing twelve different PartialEq and PartialOrd impls.

²

"this" means having nested return types like Join<Join<A, B>, C> which return tuples in the shape of ((a, b), c), etc. What we want is to be able to have a single "flat" Join<A, B, C...> return type which returns (a, b, c) without any nesting. That's the challenge we're trying to solve.

³

"Variadic" means "can support any number of arguments". For example println! can print out any number of arguments. We can't express that using regular Rust functions yet.

Just like const generics solved a similar issue for arrays, it seems likely that we'll eventually want to solve this for tuples too. But that's unlikely to happen anytime soon (if I'd hazard a guess, I'd say if work starts on this in 2022, then maybe we could have it by 2023?). So for now we need to work around this restriction, and luckily there are a few options.

How to expose futures concurrency

In response to "future::join and const-eval", matthieum pointed out we can invert the API instead. Rather than exposing this functionality as functions scoped under std::future, we could instead extend in-line container types with traits that allow for more fluent composition. I wrote the futures-concurrency crate to test out that design, and the design feels incredibly good. For example, this is what it looks like to await multiple similarly-typed futures without any intermediate allocations:

use std::future;
use futures_concurrency::prelude::*;

let a = future::ready(1u8);
let b = future::ready(2u8);
let c = future::ready(3u8);
assert_eq!([a, b, c].join().await, [1, 2, 3]);

And this is for multiple differently-typed futures without any intermediate allocations:

use std::future;
use futures_concurrency::prelude::*;

let a = future::ready(1u8);
let b = future::ready("hello");
let c = future::ready(3u16);
assert_eq!((a, b, c).join().await, (1, "hello", 3));

But sometimes we want to allocate because we don't know how many futures we're going to want to await in parallel. So it works with Vec<Future> as well, similar to futures' join_all function:

use std::future;
use futures_concurrency::prelude::*;

let a = future::ready(1u8);
let b = future::ready(2u8);
let c = future::ready(3u8);
assert_eq!(vec![a, b, c].join().await, vec![1, 2, 3]);

This feels like the most promising approach so far. Once the right traits are in scope (2024 edition, anyone?), the way to await becomes intuitive. We no longer need to chain calls to .join().join().join() or wrap items in awkward macros. Instead we group futures together, suffix it with our preferred method of concurrency, and then types pop out on the other end. Easy.
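For a feel of how such an extension trait could be wired up, here's a simplified sketch for `Vec`. It is not the futures-concurrency implementation: the `BoxFuture` alias, `block_on`, and `vec_demo` are assumptions made for this example, and unlike the crate it boxes every future to avoid pin projection:

```rust
use std::future::{poll_fn, ready, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Simplified stand-in for the crate's trait.
trait Join {
    type Output;
    type Future: Future<Output = Self::Output>;
    fn join(self) -> Self::Future;
}

/// Boxed future, used here because the type of an `async` block can't be named.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

impl<F: Future + 'static> Join for Vec<F> {
    type Output = Vec<F::Output>;
    type Future = BoxFuture<Self::Output>;

    fn join(self) -> Self::Future {
        Box::pin(async move {
            // Box each future so it can be polled without pin projection.
            let mut futures: Vec<Pin<Box<F>>> = self.into_iter().map(Box::pin).collect();
            let mut outputs: Vec<Option<F::Output>> = futures.iter().map(|_| None).collect();
            poll_fn(|cx| {
                let mut all_done = true;
                for (fut, slot) in futures.iter_mut().zip(outputs.iter_mut()) {
                    // Only poll futures which haven't produced a value yet.
                    if slot.is_none() {
                        match fut.as_mut().poll(cx) {
                            Poll::Ready(value) => *slot = Some(value),
                            Poll::Pending => all_done = false,
                        }
                    }
                }
                if all_done {
                    Poll::Ready(())
                } else {
                    Poll::Pending
                }
            })
            .await;
            let outputs: Vec<F::Output> = outputs
                .into_iter()
                .map(|slot| slot.expect("every future completed"))
                .collect();
            outputs
        })
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn vec_demo() -> Vec<u8> {
    // The number of futures is only known at runtime.
    let futures: Vec<_> = (1u8..=3).map(ready).collect();
    block_on(futures.join())
}
```

With the trait in scope, calling `.join()` on the container is all that's needed at the call site; the outputs come back in the same order the futures went in.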

There are some caveats to this though. The way we've implemented this for tuples is the same way the stdlib implements PartialOrd and PartialEq: by using a macro. However because we're returning custom types from these traits, it means we're currently generating twelve different Join types. Surely there must be a way around this right?
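The macro technique itself looks roughly like this. `TupleLen` is a made-up trait standing in for `Join` or `PartialEq`; the point is only to show one macro invocation stamping out an impl per tuple arity:

```rust
/// Hypothetical trait, implemented once per tuple arity.
trait TupleLen {
    const LEN: usize;
}

/// Generates one `TupleLen` impl for every parenthesized list of type names.
macro_rules! impl_tuple_len {
    ($( ($($name:ident),+) ),+ $(,)?) => {
        $(
            impl<$($name),+> TupleLen for ($($name,)+) {
                // Count the type parameters by building an array of their names.
                const LEN: usize = [$(stringify!($name)),+].len();
            }
        )+
    };
}

// Without variadics, every supported arity has to be listed explicitly.
impl_tuple_len!((A), (A, B), (A, B, C));

/// Helper to read the constant from a value.
fn tuple_len<T: TupleLen>(_tuple: &T) -> usize {
    T::LEN
}
```

A tuple with four elements would not implement `TupleLen` here, which is the same kind of arity ceiling the stdlib's tuple impls have.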

Async Traits and TAITs

Currently the futures_concurrency::Join trait is defined like this:

trait Join {
    type Output;
    type Future: Future<Output = Self::Output>;
    fn join(self) -> Self::Future;
}

We have two types: the Output type which is what our method will return once awaited. And the Future type which is the future our join method will return that needs to be awaited. However the async foundations WG is working on introducing "async traits" into the language. Once that's in place it's expected we'll be able to rewrite the trait like so:

trait Join {
    type Output;
    async fn join(self) -> Self::Output;
}
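Assuming a toolchain with `async fn` in traits (Rust 1.75 or newer), this shape can be sketched for arrays as follows. The impl body, the boxing of each future, `block_on`, and `array_demo` are assumptions for this example and not the crate's implementation:

```rust
use std::future::{poll_fn, ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Join {
    type Output;
    async fn join(self) -> Self::Output;
}

impl<F: Future, const N: usize> Join for [F; N] {
    type Output = [F::Output; N];

    async fn join(self) -> Self::Output {
        // Box each future so it can be polled without pin projection.
        let mut futures = self.map(Box::pin);
        let mut outputs: [Option<F::Output>; N] = std::array::from_fn(|_| None);
        poll_fn(|cx| {
            let mut all_done = true;
            for (fut, slot) in futures.iter_mut().zip(outputs.iter_mut()) {
                // Only poll futures which haven't produced a value yet.
                if slot.is_none() {
                    match fut.as_mut().poll(cx) {
                        Poll::Ready(value) => *slot = Some(value),
                        Poll::Pending => all_done = false,
                    }
                }
            }
            if all_done {
                Poll::Ready(())
            } else {
                Poll::Pending
            }
        })
        .await;
        outputs.map(|slot| slot.expect("every future completed"))
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn array_demo() -> [u8; 3] {
    block_on([ready(1u8), ready(2u8), ready(3u8)].join())
}
```

Note that there's no `Future` associated type and no hand-written `poll` method on a named type; the returned future is anonymous.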

It's expected that all async traits will go this way, since it removes the need to manually implement futures using Pin ⁴ (which is a big improvement for ergonomics). However this comes with an issue: How do we manually return futures?

⁴

I consider Pin to be like unsafe {} in that it's essential for Rust to expose this - but under no circumstance should it be required to perform common operations.

The answer for that appears to be TAITs ("Type Alias Impl Trait"). This allows impl Trait to be used in type aliases, where it's treated as a concrete type. Async traits haven't been RFC'd yet, but I expect we'll want to enable them to work with TAITs once they stabilize. So that would allow us to implement the async trait differently depending on whether we want to name the returned future type or not:

// Implement the trait using `async fn`.
impl<T: Future> Join for Vec<T> {
    type Output = Vec<T::Output>;
    async fn join(self) -> Self::Output { ... }
}

// Implement the trait using a named future.
type JoinFuture<T> = impl Future<Output = Vec<T>>;
impl<T: Future> Join for Vec<T> {
    type Output = Vec<T::Output>;
    fn join(self) -> JoinFuture<T::Output> { ... }
}

Perhaps you're seeing where I'm going with this, but for tuples we can then create a shared Join future type for all tuple variants without needing to have variadics in the language:

type JoinFuture<T> = impl Future<Output = T>;

// Implement `Join` for tuple `(A, B)`.
impl<A, B> Join for (A, B)
where
    A: Future,
    B: Future,
{
    type Output = (A::Output, B::Output);
    fn join(self) -> JoinFuture<Self::Output> { ... }
}

// ...and repeat for all other tuples (likely using a macro).
impl<A, B, C> Join for (A, B, C) { ... }
impl<A, B, C, D> Join for (A, B, C, D) { ... }
impl<A, B, C, D, E> Join for (A, B, C, D, E) { ... }

And once we have variadics as part of the language, we could migrate the implementations to use that instead:

type JoinFuture<T> = impl Future<Output = T>;

// Implement `Join` for all tuples of two or more items.
impl<A, B...> Join for (A, B...)
where
    A: Future,
    B...: Future, // <- Entirely made up variadic syntax
{
    type Output = (A::Output, B::Output...);
    fn join(self) -> JoinFuture<Self::Output> { ... }
}

It's important to emphasize though that the semantics of TAITs have not yet been stabilized, and are subject to change. But the introduction of TAITs may also have implications for the rest of the language; for example bringing inference to manual trait implementations too. ⁵

⁵

An example of this could be impl Iterator for Foo { fn next(&mut self) -> Option<i32> { ... } }. The associated type Item can directly be inferred from fn next's return type. If this was allowed, it would bring the semantics of manual trait implementations closer to those done via TAITs.

Road to std

Despite what you might think after the last section, we don't need to wait until we have async traits, TAITs, or variadics in the language to provide a good experience for async concurrency. Instead we can start small, and as more language features stabilize, incrementally expose more functionality.

The most basic form in which we could introduce Join into the stdlib would be to make it implementable only within the stdlib, using the C-SEALED future-proofing pattern:

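pub trait Join: private::Sealed {
    type Output;
    type Future: Future<Output = Self::Output>;
    fn join(self) -> Self::Future;
}

mod private {
    // Public trait in a private module: visible to the supertrait bound,
    // but impossible to name (and thus implement) from outside.
    pub trait Sealed {}
    impl<A: Future, B: Future> Sealed for (A, B) {}

    pub struct Join2<A, B> { ... }
    impl<A: Future, B: Future> Future for Join2<A, B> { ... }
    impl<A: Future, B: Future> super::Join for (A, B) { type Future = Join2<A, B>; ... }

    // ...etc
}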

In addition to the trait being sealed, the returned futures are private too. The only way to reference the associated futures is through impl Future, which lines up perfectly with async traits. Those would allow us to drop the seal and publicly expose the trait as core::future::Join like this:

trait Join {
    type Output;
    async fn join(self) -> Self::Output;
}

This version still only allows storing the future as impl Future, but once TAITs land it can be extended to named futures instead. And then eventually, with variadics, we can make the final jump to feature completeness.

In summary: if we wanted to, we could start implementing a fully forward compatible version of Join and related APIs in the stdlib today!
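To make that starting point a little more concrete, here's a sketch of the sealed version for two-element tuples using only stable features. The module name `sealed_join`, the boxed fields of `Join2`, and the `block_on` helper are assumptions for this example rather than a proposed stdlib design:

```rust
use std::future::{ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

mod sealed_join {
    use std::future::Future;

    /// Callable from anywhere, implementable only inside this module.
    pub trait Join: private::Sealed {
        type Output;
        type Future: Future<Output = Self::Output>;
        fn join(self) -> Self::Future;
    }

    impl<A: Future, B: Future> Join for (A, B) {
        type Output = (A::Output, B::Output);
        type Future = private::Join2<A, B>;

        fn join(self) -> Self::Future {
            private::Join2::new(self.0, self.1)
        }
    }

    mod private {
        use std::future::Future;
        use std::pin::Pin;
        use std::task::{Context, Poll};

        pub trait Sealed {}
        impl<A: Future, B: Future> Sealed for (A, B) {}

        /// Unnameable outside `sealed_join`: callers only see `impl Future`.
        pub struct Join2<A: Future, B: Future> {
            a: Pin<Box<A>>,
            b: Pin<Box<B>>,
            out_a: Option<A::Output>,
            out_b: Option<B::Output>,
        }

        // The inner futures are boxed and the outputs are never pinned,
        // so moving a `Join2` around is fine.
        impl<A: Future, B: Future> Unpin for Join2<A, B> {}

        impl<A: Future, B: Future> Join2<A, B> {
            pub(super) fn new(a: A, b: B) -> Self {
                Join2 { a: Box::pin(a), b: Box::pin(b), out_a: None, out_b: None }
            }
        }

        impl<A: Future, B: Future> Future for Join2<A, B> {
            type Output = (A::Output, B::Output);

            fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
                let this = self.get_mut();
                if this.out_a.is_none() {
                    if let Poll::Ready(value) = this.a.as_mut().poll(cx) {
                        this.out_a = Some(value);
                    }
                }
                if this.out_b.is_none() {
                    if let Poll::Ready(value) = this.b.as_mut().poll(cx) {
                        this.out_b = Some(value);
                    }
                }
                match (this.out_a.take(), this.out_b.take()) {
                    (Some(a), Some(b)) => Poll::Ready((a, b)),
                    // Not done yet: put back whatever we already have.
                    (a, b) => {
                        this.out_a = a;
                        this.out_b = b;
                        Poll::Pending
                    }
                }
            }
        }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn sealed_demo() -> (u8, &'static str) {
    use sealed_join::Join;
    block_on((ready(1u8), ready("hello")).join())
}
```

Code outside `sealed_join` can call `.join()`, but it can't implement `Join` for its own types because `private::Sealed` is out of reach. That leaves room to change the trait's shape later.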

Conclusion

In this post we've established a basic model for concurrently executing multiple futures. We've described a new API, prototyped in the futures-concurrency crate, which enables concurrent execution of futures in a more ergonomic way than was previously possible. And finally we've shown a path for gradually introducing these APIs into the stdlib in a forward-compatible manner.

For me, the most important point of this post is to show that it's possible to create Futures concurrency APIs which feel natural to use, and can be called with little effort. The exact shape of the traits is less important, and something we can improve on over time. We've shown that we can introduce these methods in the stdlib in a forward-compatible manner, giving us time to polish and iterate on these traits as more language features become available.

One mode of execution we didn't cover in this post is a variation on race in which multiple futures are resolved concurrently, and values are yielded as soon as they're resolved. This is what the futures::select! macro (from the futures-rs project) is commonly used for. In a future installment of this series we'll take a look at how we could potentially bring that functionality to the stdlib as well.

Another thing which we haven't covered yet is how to introduce the try variants of the APIs. The obvious choice would be to introduce a TryJoin trait, but that may not quite be the right fit for what we're trying to do. In a future installment we'll look more closely at introducing the try variants.

My plan for the futures-concurrency crate is to keep working on it, and bring it to a state where all the APIs shared in this post have been implemented. Once that's proven to work, and we're sure that this method is forward compatible, we can consider proposing this for addition to the stdlib.

But that's something for the future. For now I hope you enjoy the futures-concurrency crate, and I'd be keen to hear what folks think!

Special thanks to: Sy Brand for help with understanding variadics in C++, Scott McMurray for elaborating on TAITs, and Ryan Levick and Eric Holk for reviewing this post prior to publishing.
