Iterator as an Alias
— 2023-11-08
- iterators vs coroutines
- iterators as an alias for coroutines
- fallible functions
- support for pin
- conclusion
This is another short post covering another short idea: I want to explain the
mechanics required to make Iterator an alias for the Coroutine trait. Making
that an alias is something Eric Holk brought up yesterday. We then talked it
through and mutually decided it probably wasn't practical. But I thought about
it some more today, and I might have just figured out a way we could make it
work? In this post I want to briefly sketch what that could look like.
⚠️ DISCLAIMER: I'm making up a ridiculous amount of syntax in this post. None of this is meant to be interpreted as a concrete proposal. This is just me riffing on an idea; which is a very different flavor from some of my multi-month research posts. Please treat this post for what it is: a way to share potentially interesting ideas in the open. ⚠️
Iterators vs Coroutines
An iterator is a type which has a next method, which returns an Option. Every
time you call next it either returns Some(T) with a value, or None to indicate
it has no data to return. A coroutine is a variation of this same idea, but
fundamentally it differs in two ways:
- Coroutines can take arguments to their resume function, which allows yield
  statements to evaluate to values.
- Coroutines can return a different type than they yield. Iterators can only
  yield a type, but will always return ().
Just so we're able to reference it throughout this post, here are the (simplified) signatures:
// The iterator trait
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

// The coroutine trait
trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(&mut self, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
}

// The enum returned by the coroutine
enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}
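To make the difference concrete, here is a small sketch of implementing each trait by hand. It assumes local copies of the simplified signatures above (the real Coroutine trait is unstable), and Counter and Accumulator are made-up example types, not anything from the stdlib.

```rust
// Local copies of the simplified signatures above. The real
// `std::ops::Coroutine` is unstable, so nothing here uses it.
#[derive(Debug, PartialEq)]
pub enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}

pub trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(&mut self, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
}

/// Hypothetical iterator: yields `0..limit`, then `None`.
pub struct Counter {
    pub current: u32,
    pub limit: u32,
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.current < self.limit {
            let item = self.current;
            self.current += 1;
            Some(item)
        } else {
            None
        }
    }
}

/// Hypothetical coroutine: adds up the values it is resumed with.
pub struct Accumulator {
    pub total: u32,
    pub remaining: u32,
}

impl Coroutine<u32> for Accumulator {
    type Yield = u32; // the running total
    type Return = &'static str; // a different type than `Yield`

    fn resume(&mut self, arg: u32) -> CoroutineState<u32, &'static str> {
        if self.remaining == 0 {
            return CoroutineState::Complete("done");
        }
        self.remaining -= 1;
        self.total += arg;
        CoroutineState::Yielded(self.total)
    }
}
```

The iterator can only hand values out. The coroutine takes a value in on every resume call, and finishes with a value of a different type than the one it yields.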
Because it's useful to show, here's roughly what (I imagine) it could look like to implement both iterators and coroutines using generator function notation:
// Returning an iterator
// (syntax example, entirely made up, not an actual proposal)
gen fn eat_food(cat: &mut Cat) -> yields Purr {
//                              ^        ^
//                              |        |
//                              |        The type yielded (output)
//                              |
//                              The return type is always `()`
//
    for _ in 0..10 {
        // yield a `Purr` ten times
        yield Purr::new();
    }
}
// Returning a co-routine
// (syntax example; entirely made up, not an actual proposal)
gen(Food) fn eat_food(cat: &mut Cat) -> Nap yields Purr {
//  ^                                   ^          ^
//  |                                   |          |
//  |                                   |          The type yielded (output)
//  |                                   |
//  |                                   The return type (output)
//  |
//  The type `yield` resumes with (input)
    for _ in 0..10 {
        // Every time we yield a `Purr`, yield resumes with `Food`.
        let food = yield Purr::new();
        cat.eat(food);
    }
    Nap::new() // We end the function by returning a `Nap`
}
There are probably better ways of writing this. For example: imo the types
passed to yield should be a special entry in the arguments list. We can
possibly drop the -> requirement for iterator-returning generator functions.
And using gen as a prefix probably isn't necessary either. But that's all
stuff we can figure out later. I at least wanted to make sure folks had an
opportunity to see what functionality coroutines provide when writing generator
functions.
Iterators as an alias for Coroutines
Practically speaking this means that coroutines provide a superset of the functionality of iterators. Enough so that if we squint, we might see it work out. We can recreate the iterator trait if we implement a coroutine with this signature:
impl Coroutine<()> for MyIterator {
    type Yield = SomeType;
    type Return = ();
    fn resume(&mut self, arg: ()) -> CoroutineState<Self::Yield, ()> { .. }
}
It's tempting to think we could create a trait alias for this. But we run into
trouble here: the coroutine trait wants to be able to return a CoroutineState,
while Iterator is hard-coded to return Option. And we can't change the existing
Iterator trait to return anything but an Option. So what should we do here?
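To make the mismatch concrete: today the two return types have to be translated by hand. Below is a minimal sketch of that translation, assuming a local copy of the simplified Coroutine trait. IterAdapter and Countdown are hypothetical names made up for this example.

```rust
// Local copies of the simplified signatures; not the unstable std trait.
pub enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}

pub trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(&mut self, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
}

/// Hypothetical wrapper: turns a `Coroutine<(), Return = ()>` into an iterator.
pub struct IterAdapter<C>(pub C);

impl<C> Iterator for IterAdapter<C>
where
    C: Coroutine<(), Return = ()>,
{
    type Item = <C as Coroutine<()>>::Yield;

    fn next(&mut self) -> Option<Self::Item> {
        // The whole job of the adapter: map one enum onto the other.
        match self.0.resume(()) {
            CoroutineState::Yielded(item) => Some(item),
            CoroutineState::Complete(()) => None,
        }
    }
}

/// Hypothetical coroutine: counts down from its starting value to 1.
pub struct Countdown(pub u32);

impl Coroutine<()> for Countdown {
    type Yield = u32;
    type Return = ();

    fn resume(&mut self, _arg: ()) -> CoroutineState<u32, ()> {
        if self.0 == 0 {
            CoroutineState::Complete(())
        } else {
            let item = self.0;
            self.0 -= 1;
            CoroutineState::Yielded(item)
        }
    }
}
```

With the wrapper in place, IterAdapter(Countdown(3)) can be used with collect, for loops, and the rest of the iterator API. A trait alias would need this mapping to disappear entirely, which is where the return type gets in the way.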
Fallible functions
We can't change Iterator, but we can change Coroutine. What if rather than
hard-coding that it returned a CoroutineState, we instead said that it could
return anything which implements the Try trait? Try is implemented for Option,
and CoroutineState has the same "keep going or stop" shape, so it could
implement Try as well! And if we had try or throws functions, we could make
that configurable like so:
// The coroutine trait
trait Coroutine<R> {
    type Yield;
    try fn resume(&mut self, arg: R) -> Self::Yield throws { .. }
}
edit(2023-12-01): Yosh from the future here. Scott McMurray helpfully
provided an un-handwaved desugaring of what this signature could look like.
Crucially it shows that we could have a fully generic throws clause without
defining what exactly it throws:
// The coroutine trait
trait Coroutine<R> {
    type Yield;
    fn resume(&mut self, arg: R) -> <Self::Residual as ops::Residual<Self::Yield>>::TryType { .. }
}
I'm super hand-waving things away here. But roughly what I mean to describe
is that this can return any type implementing Try whose output is Self::Yield,
without describing the specific Try::Residual type. Meaning that could equally
be an Option, a CoroutineState, or a Result. The next function should not
distinguish between those. If we take that path, then we could use that to
rewrite the Iterator trait as an alias for Coroutine. We don't have a syntax
for this, so I'll just make up a syntax based on return type notation:
// A trait alias, hard-coding both the return type and resume argument to `()`.
trait Iterator = Coroutine<(), next(..) -> Self::Yield throws None>;
There's probably a better way of writing this. But syntax is something we can figure out. Semantically this would make the two equivalent, and it should be possible to find a way to make them work the same way.
Also on a quick syntactical note: the throws None syntax here operates on the
assumption that the right syntax to declare which type of residual you're
targeting is by writing a refinement. So to use Option<T> as the Try type,
you'd write throws None. For Result<T, E> you'd write throws Err(E). And so
on. That way if you strip the throws Ty notation, you'd be left with the base
function signature. In a fun mirrored sort of way, I don't think we'd need to
prefix fallible functions with a try fn notation. Just writing throws in the
return type should be enough. I think the same should be true for generator
functions too; just writing yields T in the signature should be enough.
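None of the syntax above exists, but the shape of the idea can be approximated on stable Rust. The sketch below is an approximation under loud assumptions: TryLike is a made-up stand-in for the unstable Try trait, the Output associated type plays the role of the throws clause, and the free function next stands in for the alias by pinning the resume argument to () and the output to Option. OptionCounter and StateCounter are hypothetical example types.

```rust
#[derive(Debug, PartialEq)]
pub enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Hypothetical stand-in for `Try`: a value that either carries a `T`
/// or signals "stop". This is NOT the real (unstable) trait.
pub trait TryLike<T> {
    fn into_yielded(self) -> Option<T>;
}

impl<T> TryLike<T> for Option<T> {
    fn into_yielded(self) -> Option<T> {
        self
    }
}

impl<T, R> TryLike<T> for CoroutineState<T, R> {
    fn into_yielded(self) -> Option<T> {
        match self {
            CoroutineState::Yielded(value) => Some(value),
            CoroutineState::Complete(_) => None,
        }
    }
}

/// `resume` no longer hard-codes `CoroutineState`; each impl picks `Output`.
pub trait Coroutine<R> {
    type Yield;
    type Output: TryLike<Self::Yield>;
    fn resume(&mut self, arg: R) -> Self::Output;
}

/// Stand-in for the alias: resume with `()`, and "throws None".
pub fn next<C, Y>(coroutine: &mut C) -> Option<Y>
where
    C: Coroutine<(), Yield = Y, Output = Option<Y>>,
{
    coroutine.resume(())
}

/// Generic code can still drive any coroutine without naming `Output`.
pub fn drain<C: Coroutine<()>>(coroutine: &mut C) -> Vec<<C as Coroutine<()>>::Yield> {
    let mut items = Vec::new();
    while let Some(item) = coroutine.resume(()).into_yielded() {
        items.push(item);
    }
    items
}

/// Hypothetical coroutine whose `Output` is `Option`.
pub struct OptionCounter(pub u32);

impl Coroutine<()> for OptionCounter {
    type Yield = u32;
    type Output = Option<u32>;

    fn resume(&mut self, _arg: ()) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        self.0 -= 1;
        Some(self.0)
    }
}

/// Hypothetical coroutine whose `Output` is `CoroutineState`.
pub struct StateCounter(pub u32);

impl Coroutine<()> for StateCounter {
    type Yield = u32;
    type Output = CoroutineState<u32, &'static str>;

    fn resume(&mut self, _arg: ()) -> CoroutineState<u32, &'static str> {
        if self.0 == 0 {
            return CoroutineState::Complete("done");
        }
        self.0 -= 1;
        CoroutineState::Yielded(self.0)
    }
}
```

Only OptionCounter satisfies the bounds on next, which is roughly what the throws None refinement is meant to express. Both types work with drain, since it only relies on the Try-like bound.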
Support for pin
Now, we've gone through an entire post to show how we could turn Iterator into
an alias for Coroutine. But there is one more detail we've skipped over: in
the compiler the Coroutine trait is used to implement the future state machines
created by the async keyword. In order for borrows to work across .await
points in async functions, the Coroutine trait needs to support
self-references. Today that's done using the Pin self-type, which means the
actual signature of Coroutine today is:
trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
    //        ^ this changed
}
So Coroutines must be able to express self-references. And iterators currently
don't support that. Which means we need to find a way to bridge the two. Luckily
this is something we might be able to resolve if we consider pinning as an
orthogonal capability which we can express as a trait transformer.
Ideally what we'd do is make Coroutine, and by extension Iterator, generic
over "may be self-referential". I talk about this more in my latest post on the
async trait, specifically pointing out that async iterators don't need to be
self-referential, and that generator functions probably do want to be able to
have self-references. Making Iterator an alias for Coroutine further reinforces
that requirement.
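Without a "may be self-referential" effect, the bridge between a pinned resume and an unpinned next has to be written by hand. The sketch below shows two ways that can look today, assuming a local copy of the pinned Coroutine signature. UnpinIter, BoxedIter, and Countdown are hypothetical names made up for this example.

```rust
use std::pin::Pin;

pub enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}

// Local copy of the pinned signature; not the unstable std trait.
pub trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
}

/// Bridge 1: only for coroutines that are not self-referential (`Unpin`).
pub struct UnpinIter<C>(pub C);

impl<C> Iterator for UnpinIter<C>
where
    C: Coroutine<(), Return = ()> + Unpin,
{
    type Item = <C as Coroutine<()>>::Yield;

    fn next(&mut self) -> Option<Self::Item> {
        // `Pin::new` is safe here only because of the `Unpin` bound.
        match Pin::new(&mut self.0).resume(()) {
            CoroutineState::Yielded(item) => Some(item),
            CoroutineState::Complete(()) => None,
        }
    }
}

/// Bridge 2: works for any coroutine by pinning it on the heap first.
pub struct BoxedIter<C>(pub Pin<Box<C>>);

impl<C> Iterator for BoxedIter<C>
where
    C: Coroutine<(), Return = ()>,
{
    type Item = <C as Coroutine<()>>::Yield;

    fn next(&mut self) -> Option<Self::Item> {
        match self.0.as_mut().resume(()) {
            CoroutineState::Yielded(item) => Some(item),
            CoroutineState::Complete(()) => None,
        }
    }
}

/// Hypothetical coroutine: counts down from its starting value to 1.
pub struct Countdown(pub u32);

impl Coroutine<()> for Countdown {
    type Yield = u32;
    type Return = ();

    fn resume(mut self: Pin<&mut Self>, _arg: ()) -> CoroutineState<u32, ()> {
        if self.0 == 0 {
            CoroutineState::Complete(())
        } else {
            let item = self.0;
            self.0 -= 1;
            CoroutineState::Yielded(item)
        }
    }
}
```

Both wrappers pay a price: the first excludes self-referential coroutines, the second adds an allocation. That is the gap a "may be self-referential" capability would be meant to close.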
Conclusion
This was a quick post to illustrate a way we could make Iterator a more
narrowly scoped alias for Coroutine. I'm not sure we should do this; from my
perspective we have a few options:
- Never stabilize the co-routine trait. This is probably fine, since being able to pass arguments to yield or change the return type from the yield type don't seem like major features. Useful, but not critical.
- Implement Coroutine as a superset of Iterator. Keeping both, but using blanket impls to make both work with one another. It's feasible, but it feels like we're creating "V2" interfaces which aren't interchangeable. You could pass an iterator everywhere a generator was accepted. But you couldn't pass a generator everywhere an iterator was accepted - unless we carve out a special exception just for this, which seems like a pretty bad hack.
- Make Iterator an alias for Coroutine. That's what this post is about. It would be the way we bring some of the core unstable functionality out of the stdlib and make it available to users, without creating an "iterator v2" trait. If we're serious about ever wanting to stabilize Coroutines, this seems preferable.
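For comparison, the blanket-impl option (the second one in the list) could be sketched like this. It assumes a local copy of the simplified Coroutine trait, and drive is a hypothetical helper made up for the example.

```rust
#[derive(Debug, PartialEq)]
pub enum CoroutineState<Y, R> {
    Yielded(Y),
    Complete(R),
}

// Local copy of the simplified signature; not the unstable std trait.
pub trait Coroutine<R> {
    type Yield;
    type Return;
    fn resume(&mut self, arg: R) -> CoroutineState<Self::Yield, Self::Return>;
}

// Blanket impl: every iterator is also a coroutine which resumes with `()`
// and returns `()`.
impl<I: Iterator> Coroutine<()> for I {
    type Yield = I::Item;
    type Return = ();

    fn resume(&mut self, _arg: ()) -> CoroutineState<I::Item, ()> {
        match self.next() {
            Some(item) => CoroutineState::Yielded(item),
            None => CoroutineState::Complete(()),
        }
    }
}

/// Hypothetical helper: an API written against coroutines.
pub fn drive<C>(mut coroutine: C) -> Vec<<C as Coroutine<()>>::Yield>
where
    C: Coroutine<(), Return = ()>,
{
    let mut items = Vec::new();
    while let CoroutineState::Yielded(item) = coroutine.resume(()) {
        items.push(item);
    }
    items
}
```

Here drive(1..4) works, because the range is an iterator and so picks up the blanket impl. Nothing in this sketch covers the other direction: a hand-written coroutine still can't be handed to an API that expects an iterator. With the aliasing option there would be only one trait, so that asymmetry wouldn't come up.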
A nice side-effect of this approach is that we could use this as a way to normalize our syntax as well. Right now we're discussing adding "generator functions" in the 2024 edition. But we're making a distinction between "iterators", "generators", and "co-routines". If we were to adopt the aliasing model, then the iterator trait would be the same as the generator trait. And we would no longer have a need for another, separate concept for "co-routines".
I for one like the idea that a "generator function" returns an impl Generator.
And even if I don't think that working on that should be a priority, I like that
we might now have a mechanism to make this a reality by creating a trait alias,
hard-coding certain params, and relying on the Try trait.