Why `Pin` is a part of trait signatures (and why that's a problem)
— 2024-10-15
I've been wondering for a little while why the Future::poll method takes
self: Pin<&mut Self> in its signature. I assumed there must have been a good
reason, but when I asked my fellow WG Async members nobody seemed to know
offhand why that was exactly. Or maybe they did, and I just had some trouble
following. Either way, I think I've figured it out, and I want to spell it out
for posterity so that others can follow too.
Why Pin is part of method signatures
Take for example a type MyType and a trait MyTrait. We can write an
implementation of MyTrait which is only available when MyType is pinned:
use std::pin::Pin;

trait MyTrait {
    fn my_method(&mut self) {}
}
struct MyType;
// implemented for the pinned reference, not for `MyType` itself
impl MyTrait for Pin<&mut MyType> {}
Inside of functions we can even write bounds for this. Shout out here to Eric Holk who showed me that apparently the left-hand side of trait bounds can contain arbitrary types - not only generics or even types that are part of the function signature. I had no idea.
With that we can express that we're taking some type T by-value, and once we
pin that value it will implement MyTrait:
use std::pin::{pin, Pin};

fn my_function<T>(input: T)
where
    for<'a> Pin<&'a mut T>: MyTrait,
{
    let pinned_input = pin!(input);
}
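To see that bound do some work, here is a minimal runnable sketch. The u32 field on MyType and the u32 return value of my_method are assumptions made purely for illustration, so that the call has an observable result; they are not part of the example above.

```rust
use std::pin::{pin, Pin};

trait MyTrait {
    // Hypothetical signature: returns a value so the call is observable.
    fn my_method(&mut self) -> u32;
}

// Hypothetical payload field, added for illustration only.
struct MyType(u32);

// Only the pinned reference implements the trait.
impl MyTrait for Pin<&mut MyType> {
    fn my_method(&mut self) -> u32 {
        // `self` is `&mut Pin<&mut MyType>`; field access auto-derefs to `MyType`.
        self.0
    }
}

fn my_function<T>(input: T) -> u32
where
    // The left-hand side of the bound is `Pin<&mut T>`, not `T`.
    for<'a> Pin<&'a mut T>: MyTrait,
{
    // `pin!` pins `input` to this stack frame and yields `Pin<&mut T>`.
    let mut pinned_input = pin!(input);
    pinned_input.my_method()
}
```

Calling my_function(MyType(7)) type-checks because Pin<&mut MyType> implements MyTrait, even though MyType on its own does not.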
Inside of MyTrait::my_method, the type of self will be &mut Pin<&mut MyType>.
That's not the same as the owned type Pin<&mut MyType>, but luckily we can
convert that into an owned type by calling Pin::as_mut.
The docs contain a big explainer on why it's safe here to go from a mutable
reference to an owned instance, which intuitively goes against Rust's ownership
rules.
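As a small sketch of that conversion: Pin::as_mut reborrows a &mut Pin<&mut T> as a fresh, shorter-lived Pin<&mut T>, which is why the same pinned value can be handed to a by-value consumer more than once. Counter, bump, and bump_twice are hypothetical names chosen for this example.

```rust
use std::pin::{pin, Pin};

// Hypothetical type used only for this illustration.
struct Counter {
    value: u32,
}

// Takes the "owned" pinned reference by value, like `Future::poll` does.
fn bump(mut this: Pin<&mut Counter>) {
    // `Counter` is `Unpin`, so `Pin<&mut Counter>` can be mutably dereferenced.
    this.value += 1;
}

fn bump_twice() -> u32 {
    let mut pinned = pin!(Counter { value: 0 });
    // Each `as_mut` call turns `&mut Pin<&mut Counter>` into a new
    // `Pin<&mut Counter>` without giving up the original pin.
    bump(pinned.as_mut());
    bump(pinned.as_mut());
    pinned.value
}
```

Without as_mut the first call to bump would move pinned, and the second call would not compile.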
But what happens now if rather than writing a generic type T with a where
clause, we instead want to use an impl Trait in argument position (APIT)? We
might want to write something like this:
// how do we express those same bounds here?
fn my_function(input: impl ???) {
    let pinned_input = pin!(input);
}
But we have no way to express that exact bound. Unlike regular generics, APITs
can't express the left-hand side of the bound (lvalue), they can only name the
right-hand side (rvalue). This becomes even more pronounced when we try and use
impl Trait in return position (RPIT).
Take for example a function that returns some type T. Using concrete trait
bounds we can express that this returns a type which when pinned implements
MyTrait:
fn my_function<T>() -> T
where
    for<'a> Pin<&'a mut T>: MyTrait,
{
    // sketch only: `MyType` stands in for `T`; as written the caller picks `T`
    MyType {}
}
But if we then try and express that same function using RPIT, we lose the
ability to express that bound. The only way to express an -> impl Trait
which exposes functionality when it's pinned is to make Pin directly part of
the signature on the methods, and not implement the trait for a Pin<&mut Type>:
trait MyTrait {
    fn my_method(self: Pin<&mut Self>) {} // ← note the signature of self here
}
struct MyType;
impl MyTrait for MyType {} // ← no longer implemented for `Pin<&mut MyType>`
And now all of a sudden we can express -> impl MyTrait whose methods can only
be called when MyType is pinned, with Unpin being the opt-out for types
where that is not the case.
fn my_function() -> impl MyTrait { // Can be either pinned or not!
    MyType {}
}
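Put together, a minimal runnable sketch of this approach could look as follows. The string returned from my_method and the call_through_rpit helper are assumptions added for illustration, so there is something to observe at the call site.

```rust
use std::pin::{pin, Pin};

trait MyTrait {
    // Hypothetical return value so the call has an observable result.
    fn my_method(self: Pin<&mut Self>) -> &'static str {
        "called through Pin<&mut Self>"
    }
}

struct MyType;
impl MyTrait for MyType {}

// The return type only names the right-hand side of the bound.
fn my_function() -> impl MyTrait {
    MyType
}

// Hypothetical caller: pin the opaque value, then call the method.
fn call_through_rpit() -> &'static str {
    let value = my_function();
    // The method is only reachable once the value is pinned.
    let pinned = pin!(value);
    pinned.my_method()
}
```

The caller never learns the concrete type, yet the pinning requirement still travels with the method signature.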
Implications
Concretely what this means is that if you want to have a trait that works
with pinned values and works with all language features like normal, you
have to make self: Pin<&mut Self> part of the method signature. Maybe that's
not a big deal for new traits, but it has implications for every existing trait
in the stdlib.
Take for example the Iterator trait. We can't just impl Iterator for
Pin<&mut T> and expect RPIT to work. Instead the expected route here seems like
it should be to introduce a new trait PinnedIterator that takes
self: Pin<&mut Self>. This is a backwards-incompatibility shared by all
existing traits in the stdlib except for Future, which already takes
self: Pin<&mut Self>. That's a pretty big limitation, and something that's
worth factoring into discussions about the viability of Pin beyond Future. For
Iterator it means we would want to mint at least the following variants:
use std::pin::Pin;

// iterator
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}
// address-sensitive iterator
trait PinnedIterator {
    type Item;
    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}
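To make the shape of such a trait concrete, here is a small sketch of implementing and driving a PinnedIterator. Countdown and collect_all are hypothetical names invented for this example, and Countdown is not actually address-sensitive; it only shows how the signature is used.

```rust
use std::pin::{pin, Pin};

trait PinnedIterator {
    type Item;
    fn next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

// Hypothetical iterator that counts down to zero.
struct Countdown {
    remaining: u32,
}

impl PinnedIterator for Countdown {
    type Item = u32;

    fn next(mut self: Pin<&mut Self>) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        // `Countdown` is `Unpin`, so mutating through the pin is allowed.
        self.remaining -= 1;
        Some(self.remaining)
    }
}

// Hypothetical driver: pins the iterator once, then reborrows per call.
fn collect_all<I: PinnedIterator>(iter: I) -> Vec<I::Item> {
    let mut iter = pin!(iter);
    let mut out = Vec::new();
    // `as_mut` hands out a fresh `Pin<&mut I>` for every call to `next`.
    while let Some(item) = iter.as_mut().next() {
        out.push(item);
    }
    out
}
```

Note that none of the existing Iterator adapters or for loops apply to this trait, which is exactly the duplication cost being described.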
To run through some more implications of this: if we want Rust users to be
able to declare address-sensitive types in Rust, then the most likely path now
is a duplication of the traits in the std::io submodule taking a shape similar
to this:
mod io {
    pub trait Read { ... }
    pub trait PinnedRead { ... }
    pub trait Write { ... }
    pub trait PinnedWrite { ... }
    pub trait Seek { ... }
    pub trait PinnedSeek { ... }
    pub trait BufRead { ... }
    pub trait PinnedBufRead { ... }
}
Pinning, like async and try, is a combinatorial property of traits which leads
to an exponential amount of duplication. Luckily for us, duplicating traits is
not the only possible path we can take: some form of polymorphism for existing
interfaces over Pin seems possible too, as long as we're willing to change our
formulation of it. That is what led me to formulate my design for the Move
auto-trait, which is composable just like e.g. Send and Sync.
Conclusion
I want to quickly shout out my fellow WG Async folks. We've been talking about this a bunch, and it's been helpful working through this. Even if it's taken me a couple of months to actually get around to posting this. Any mistakes in this post however are definitely my own.
In this post I mainly wanted to articulate why Future takes
self: Pin<&mut Self> rather than taking &mut self and relying on
impl Future for Pin<&mut T>. I think I've found a good reason for that, and it
has again to do with the left- and right-hand side of bounds. For me this also
confirms my hypothesis that any design for generalized self-referential types
needs to be able to capture the following:
- The ability to mark a type as being immovable
- The ability to transition types from being movable to immovable
- The ability to construct immovable types in-place
- The ability to extend existing interfaces with immovability
- The ability to describe self-referential lifetimes
- The ability to safely initialize self-references without Option
This post specifically covered the fourth requirement: the ability to extend
existing interfaces with immovability. Addressing that can take the form of
duplication of interfaces (e.g. Iterator vs PinnedIterator), or composition via
(auto-)traits (e.g. Move). Other methods might be possible too, and I would
encourage people with ideas to share them.
PoignardAzur has independently described why and how pinned types need to be
able to be constructed in-place. Their post showed examples of both the third
and sixth requirements in the list. They introduced a form of emplacement via
the -> pin Type notation. This is similar to the more general -> super Type
notation I introduced in my post, which I adapted from Mara's super let post.
I hope this post helps at least partially clarify why Pin needs to be part of
interfaces, and helps spell out some of the logical consequences once we
consider how it interacts with the stdlib's strict backwards-compatibility
requirements. I believe we can and should do better than duplicating entire
interfaces over the axis of immovability.