State Machines II: an enum-based design
— 2022-08-23

  1. the case for state machines
  2. representing state machines in rust in the future
  3. ergonomics: overview
  4. ergonomics: merging self-types
  5. ergonomics: merging return types
  6. tying it together
  7. conclusion

In the past I've written about state machines in Rust. And more recently I've also written about anonymous enums 1 in Rust. I've been wondering what would happen if the two interacted? I think the result could actually be quite nice, and worth daydreaming about. In this post we'll be discussing state machines, the language features which could make them easier to use, and ways in which we could push the ergonomics further.

1

"anonymous enums" are also known as "sum types". There might be subtle differences, but I'll be using them interchangeably. For the context of this post though, we assume that taking two members of the same enum and putting them in a sum type results in a sub-set of the original enum rather than creating a new enum with two members. For the sake of brevity in this post, just assume that works as intended.

The case for state machines

A lot of programming is about going from one state to another. We can do that using conditionals statements, but as the state changes those conditions can be tricky to reason about and in turn refactor. Instead it's often more convenient to use state machines: singular objects which represent all possible states and their transitions. And in compiled languages specifically to use compiler-checked state machines 2.

2

See my JavaScript nanostate library for an example of a state machine library which is not checked by a compiler - instead performing all checks at runtime.

Say we wanted to implement a basic green-orange-red traffic light that can also flash. We can imagine the rules for transitions between states could be the following:

  1. Green can turn to orange
  2. Orange can turn to red
  3. Red can turn to green
  4. Red can turn to flashing red 3
  5. Flashing red can turn to red
3

If you've never seen a traffic light blink red before: this is common for traffic lights in the Netherlands. It can indicate that the traffic control systems are malfunctioning. Or it can be used to indicate the traffic lights have been disabled, which can be done late at night when there's hardly any traffic to be managed anyway.

All states and their transitions can be represented by the following graph:

A graph showing how all states point to each other

For the remainder of this post we'll be implementing the rules of the graph above. Our goal is to implement a compiler-checked state machine which can inform us during compilation whether all our state transitions are valid, or if we've accidentally written an invalid transition which we need to fix.

Representing State Machines in Rust In The Future

To read about how we can write compiler-checked state machines in Rust today you can read my previous post on state machines. In this post I instead want to focus on ✨ the future ✨. How could we plausibly represent this in Rust by extending the language. In the "future directions" section of the last post we wrote a state machine without the blinky lights. Using enum variants types (not yet a thing) combined with arbitrary self-types (also not yet a thing), we could write the following:

//! A traffic light without a flashing state.
//! Novel language features: enum-variant-types, arbitrary-self-types

enum State {
    Green,
    Orange,
    Red,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self: Self::Green) -> Self::Orange {
        Self::Orange
    }

    pub fn next(self: Self::Orange) -> Self::Red {
        Self::Red
    }

    pub fn next(self: Self::Red) -> Self::Green {
        Self::Green
    }
}

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    state = state.next();         // green
}

What I didn't realize at the time I wrote my last post is that this in effect acts like a form of overloading. Functionally the code we've written isn't different than writing a single next function which internally updates the state. But instead we expose the state externally. The following example will be written in valid Rust today, exposing the exact same functionality as our last example — but with the state only observable internally:

//! A traffic light without a flashing state.
//! Novel language features: none

enum State {
    Green,
    Orange,
    Red,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self) -> Self {
        match self {
            Self::Green => Self::Orange,
            Self::Orange => Self::Red,
            Self::Red => Self::Green,
        }
    }
}

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    state = state.next();         // green
}

Let's continue with the internally observable state examples for a bit. Our graph shows we need to support a flashing state, so we need to add that. We need some way to trigger going into a flashing state, so we'll add a new method: flash. Not all states can transition into the flashing state, so we need to make sure we can only go into it from "red" - which means we need to panic if we attempt to transition into it from any other state:

//! A traffic light with a flashing state.
//! Novel language features: none

enum State {
    Green,
    Orange,
    Red,
    Flashing,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self) -> Self {
        match self {
            Self::Green => Self::Orange,
            Self::Orange => Self::Red,
            Self::Red => Self::Green,
            Self::Flashing => Self::Red,
        }
    }

    pub fn flash(self) -> Self {
        match self {
            State::Red => Self::Flashing
            _ => panic!("only red can turn into flashing red"),
        }
    }
}

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    state = state.flash();        // flashing
    state = state.next();         // red
}

This is not ideal: we now have added a runtime check to a state machine which was checked entirely during compilation before. The reason a runtime check is needed is because by adding a method which isn't available on all types, the internal state becomes observable externally. And the Rust language lacks the tools to conveniently represent this 4.

4

Yes, I know we can refactor the whole thing and use traits and generics to hack together something which gives us similar semantics. I've shown how to do that in previous posts. The point of this post is not "can we represent state machines using Rust's type system" (yes we can), but "which language features are we lacking to make it easy to represent state machines in Rust's type system".

So let's try and rewrite this again using enum variant types and arbitrary self-types. Here the caller knows exactly what state the enum is in, and we can more precisely represent the states and state transitions:

//! A traffic light with a flashing state.
//! Novel language features: enum-variant-types, arbitrary-self-types

enum State {
    Green,
    Orange,
    Red,
    Flashing,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self: Self::Green) -> Self::Orange {
        Self::Orange
    }

    pub fn next(self: Self::Orange) -> Self::Red {
        Self::Red
    }

    pub fn next(self: Self::Red) -> Self::Green {
        Self::Green
    }

    // new
    pub fn next(self: Self::Flashing) -> Self::Red {
        Self::Red
    }

    // new
    pub fn flash(self: Self::Red) -> Self::Flashing {
        Self::Flashing
    }
}

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    state = state.flash();        // flashing
    state = state.next();         // red
}

This would allow us to correctly model and validate the entire state graph at compile-time, without needing to resort to any runtime checks. Yay us!

Ergonomics: Overview

Hopefully folks can see the benefit of being able to model state transitions solely through the type system. Obviously there are other ways of authoring compiler-checked state machines today - but that's not the point of this post. What we're trying to think about is how we can do so in a way which feels simple and dare I even say, intuitive for people to use.

I think what we've shown so far is pretty good. But it has a few shortcomings, namely:

  1. It's pretty verbose to individually write each transition
  2. What happens when we have more than one valid return state from a function?

Okay, so bear with me for a sec here. In the graph each line goes from a source to a destination. In the code the sources are represented by the self-types, and the destinations are represented by the return types. This means that any node in the graph with more than arrow pointing from or to it could potentially benefit from refactoring. In the graph above there are instances of this:

  1. orange and flashing point to red
  2. red points to both green and flashing

For any type system enthusiasts reading this post: yes we're going to be talking about representing powerset enums natively using Rust's typesystem here 5.

5

For everyone who isn't a type system enthusiast and doesn't know about "powerset enums", it's basically just "subsets of enums". Instead of declaring that a method can return all variants in an enum, using powersets we can declare that a method can only return a specific subset. Check out the powerset_enum crate to read more about it. Or alternatively you can keep reading, because that's basically what we'll be implementing.

Ergonomics: Merging Self-Types

Now let's start with the first one: "orange" and "flashing" both point to "red". So far we've been talking about arbitrary self-types as a feature. But there are ideas floating around about expanding this for structs, called: view types. What this allows you to do is inform the borrow checker we don't actually need access to the entire struct - we only need access to parts of it. So we can hold multiple mutable borrows on the same struct as long as the fields don't overlap.

Now what if we had the equivalent of these struct "views" but for enums. Instead of "viewing" subsets of fields, we'd be able to "view" subsets of enums. Now let's make up a syntax to show what we mean. In patterns we already have Foo | Bar to represent multiple patterns, so let's reuse that here:

// before
impl State {
    pub fn next(self: Self::Orange) -> Self::Red {
        Self::Red
    }

    pub fn next(self: Self::Flashing) -> Self::Red {
        Self::Red
    }
}
// after
impl State {
    pub fn next(self: Self::Orange | Self::Flashing) -> Self::Red {
        Self::Red
    }
}

I know many of you will likely have opinions about the syntax, but let's not focus on that for a minute. Instead I want to focus your attention attention on the semantics at play here: it would bring the same pattern semantics as used in match statements to methods, which I believe should feel a lot more natural?

Riffing time. There are some real questions about what syntax view types should use. Niko's post suggests something like &{golden_tickets} self. But what if instead we chose to reuse the pattern syntax for the self types 6 - which could yield us something like &Self { golden_tickets, .. }. This would reuse the pattern-destructuring syntax. Which is similar to what we're doing here: reusing the pattern enum matching syntax. It feels like there might be something to it?

6

My colleague eric suggested exploring this direction a while back. And I actually quite like it! (Direct any credit to Eric, but blame to me please. I've definitely run with this way past what he likely intended.)

The compiler knows that the possible members of self in the example ("orange", "flashing") are only a subset of all members in State. What if the compiler actually used this information in the function body too? That would mean matching on self would only need to account for the possible cases rather than all cases:

match self {
    Self::Orange => {}
    Self::Flashing => {}
    // no other cases need matching!
}

This isn't particularly relevant yet for the simple traffic light example we're dealing with here, but definitely becomes a lot more relevant when working with real-world code. No longer needing to panic! for branches you know can't be reached anyway would make code both simpler and faster, which is a pretty big deal!

Ergonomics: Merging Return Types

The second case is: "red can point to both green and flashing". This one will be a bit harder to reason about because instead of taking multiple different parameters we're trying to return multiple parameters. This means the compiler can no longer statically know which variant we have, and instead we have to perform runtime matching to figure that out. Right now the code we have looks like this:

// before
impl State {
    pub fn next(self: Self::Red) -> Self::Green {
        Self::Green
    }

    pub fn flash(self: Self::Red) -> Self::Flashing {
        Self::Flashing
    }
}

We have two different methods which take Self::Red and return a type. For argument's sake say we wanted a single method with took some data to determine whether we should go to green or flashing 7. This could then look something like this:

7

"For argument's sake" means I couldn't come up with a good example. So this is def a cop-out.

// after
impl State {
    pub fn next(self: Self::Red, data: Data) -> Self::Green | Self::Flashing {
        if validate_data(data) {
            Self::Green
        } else {
            Self::Flashing
        }
    }
}

Again, the syntax is not really the point. The idea here is though that just like in the earlier example the returned value would be known to be a subset of Self: it would still register as the State enum, but the only two accessible members are Green and Flashing 8.

8

Maybe a better way to signal this would be to use something like -> Self { Green | Flashing } as syntax? Idk,, I'm focusing more on semantics than syntax right now.

On the usage side things would change too: we could no longer statically know which state we're in. All we know is that the state is either green or flashing. This leads to some very real questions: should we be able to then call next again to get to orange | red?

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    state = state.next(data);     // green | flashing       - ok I guess?
    state = state.next();         // orange | red           - uhhh,,
    state = state.next(?data);    // green | red | flashing - x_x
}

Once we're in this state, we have different next methods. How do we decide how we should call them? Even if we can answer this question, the semantics of this seem pretty complex and hard to teach. So instead I propose we apply some basic rules here to simplify usage:

  1. Once a variable is assigned to a single enum-member-as-value, it can only be re-assigned to other single values and vice-versa.9
  2. Methods with the same name on the same enum need to have the same "base" (type from which a view is derived) signature.
9

Maybe this rule could be relaxed over time? I'm not sure; it seems very graph-problem-y, and I don't know if there are any great solutions to this.

The first rule would disallow assigning green | flashing to state. Instead we'd need to match to know which variant we have, and go from there:

//! Never assign an enum view-type to `state`. This requires matching on the data
//! until we've extraced a single member. Assume we wait for conditions to change
//! between calls to `state.next`.

fn main() {
    let mut state = State::new();                    // green
    state = state.next();                            // orange
    state = state.next();                            // red
    state = handle_flash(state.next(&data), &data);  // green
}

/// Keep looping until we hit a green state.
fn handle_flash(mut state: State::Green | State::Flashing, data: &Data) -> State::Green {
    match state {
        State::Green => State::Green,
        State::Flashing => {
            state = state.next();              // red
            to_green(state.next(&data), &data) // green
        }
    }

}

There are a bunch of assumptions baked in here - not in the least that an anonymous enum of State::Green | State::Flashing can be passed to an enum view-type matching the same members. But we're not doing details in this post. We're doing vibes. Yeehaw.

Okay, so for the second rule we're proposing: methods with the same name on the same enum should share the same base signature. This means that methods which return a view of some enum or struct are fine - because they're the same "base" type. But having different methods with different signatures like we've done here with next(data) would be a no-no. This would allow us to keep thinking about methods which take or return enum views as a form of specialization.

That last point is perhaps not strictly required, but it has the benefit of making the enum feel more consistent. And it could lead allow for specialization-like behavior where if you implement the same method for all enum members, it then becomes unconditionally available on the entire enum - since it can be called regardless of which state you're in, even if it yields different behavior. But on a more intuitive level: it feels pretty weird that self.next() sometimes takes an argument and sometimes doesn't. It kind of feels like parameter overloading like it's done in other languages, and Rust kind of steers clear of that. So it feels better to just enforce that methods with the same name on the same enum need to take the same arguments - even if we can wiggle around with view-types and sub-types a little.

Tying It Together

Phew, that's been a ride. The last section has been a lot about what not to do. So let's show an example of how things might look when we put them all together. Let's start with what we had before, but slightly adapted to make the flash case conditional:

//! A traffic light with a flashing state.
//! Novel language features: enum-variant-types, arbitrary-self-types

enum State {
    Green,
    Orange,
    Red,
    Flashing,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self: Self::Green) -> Self::Orange {
        Self::Orange
    }

    pub fn next(self: Self::Orange) -> Self::Red {
        Self::Red
    }

    pub fn next(self: Self::Red) -> Self::Green {
        Self::Green
    }

    // new
    pub fn next(self: Self::Flashing) -> Self::Red {
        Self::Red
    }

    // new
    pub fn flash(self: Self::Red) -> Self::Flashing {
        Self::Flashing
    }
}

fn main() {
    let mut state = State::new(); // green
    state = state.next();         // orange
    state = state.next();         // red
    if validate_data(data) {
            state = self.next();  // green
            // handle green
        } else {
            state = self.flash(); // flashing
            // handle flashing
        }
    }
}

Now if we translate that to our enum views variant, we're able to inline the conditional statement into the enum. And together with the arbitrary self enum pattern syntax, the overall code is a lot more trimmed. And especially the usage should feel a bit more natural:

//! A traffic light with a flashing state.
//! Novel language features: enum-variant-types, arbitrary-self-types,
//!                          enum-view-types, enum-sum-types

enum State {
    Green,
    Orange,
    Red,
    Flashing,
}

impl State {
    pub fn new() -> Self::Green {
        Self::Green
    }

    pub fn next(self: Self::Green) -> Self::Orange {
        Self::Orange
    }

    pub fn next(self: Self::Orange | Self::Flashing) -> Self::Red {
        Self::Red
    }

    // New: Inlines the logic of the previous example to decide whether to go
    //      from "red" to "green" or "flashing".
    pub fn try_next(self: Self::Red, data: Data) -> Self::Green | Self::Flashing {
        if validate_data(data) {
            Self::Green
        } else {
            Self::Flashing
        }
    }
}

fn main() {
    let mut state = State::new();   // green
    state = state.next();           // orange
    state = state.next();           // red
    state.try_next(data) {
        State::Green => {},         // handle green
        State::Flashing => {},      // handle flashing
    };
}

Conclusion

In this post we've looked at how using "enum variant types", "enum sum types", and "enum view types" could allow us to make state machines at the type level in Rust easier to author and more ergonomic. The example we've used here has been a variation of a basic traffic light. But despite that we expect that it should be possible to adopt the patterns and features shown here to state machines in general.

Challenges

It seems that taking enums as self-types should be relatively straight-forward. But returning anonymous enums is more complicated. I came up with some rules to help circumvent some of the sharp edges, but I feel those probably could use some more thought. Maybe folks reading this have some insights on it?

Making Enums More Useful

Also as my colleague Arlie pointed out during a recent chat: having enums-as-types would be really beneficial for other reasons too. Take for example the syn crate 10: the WherePredicate has a member Type which contains a struct called PredicateType. The reason why we have a separate struct instead of just inlining the data in the enum member, is because members can't be used as types. So if we want to author functions which operate on the data in WherePredicate::Type, we need to create a separate struct 11. This makes record-enums less useful than their struct counterparts, and complicates the API surface area of crates.

10

I picked this as an example, so if there are actually different instances where PredicateType is used - that's on me. I'm not an expert on syn so details may be wrong. But I hope the example is clear enough that at least the point comes across: because this does happen in a lot of places.

11

Ideally we should only create need to create separate structs if we want to reuse types between enums. That's not the case today.

Structural Typing and Errors

While we're on the topic of anonymous enums in general: another important use case for it would be for error handling. Anonymous enums provide us with structural typing for enums 12, which allows for convenient composition of error types. Say we have a ParseError and io::Error, a function should just be able to specify -> Result<T, ParseError | io::Error>. But more importantly: we should be able to specify which error variants can be returned. This is a bit harder with io::Error specifically because of how it's constructed. But would work great when using errors returned by thiserror. I've been meaning to write about error handling ever since I posted my error handling survey three years ago. Maybe I'll do that later this year.

12

You can think of "structural typing" as "matching on the shape" instead of "matching on the type". Matching on a tuple with (first, second) is structural: we don't care what types are in the tuple, we mostly care about the shape. It still needs to type-check of course, but combined with type inference it can give us a lot of flexibility. For example TypeScript's type system is structural, and in turn provides a lot of flexibility.

View Types

Also, while writing this post I realized that being able to use enum variants as self-types is kind of analogous to view types for structs. Both carve out subsets of their base type, but for different purposes. Which makes sense because they're actually different. But as a general trend the implications seem fun: by more precisely articulating which sub-types we use, the compiler can allow more code to function. I think the reason for that is that function boundaries act as strict guards for what's allowed, and as a rule are also pretty coarse-grained. By making the more expressive, the compiler becomes less strict, and in turn more code can be allowed. It really does feel like view types and the like might represent a generational leap akin to NLL in terms of general language ergonomics. I think it's interesting to think about what else could be loosened within the framework of Rust's existing semantics.

Goals of this post

To re-iterate: this post is intentionally incomplete. I've knowingly glossed over details, and left pieces out. The goal of the post was to show what the outcomes we could have if we combined some features folks are thinking about anyway. Together with a personal goal to not spend months writing about this, because I actually should be working on diffferent things. Rust's roadmap is always a balance of team prioritization and personal interests; so showing what can be achieved by combining several features might lower the bar to make the case these features would be useful, or perhaps even get people interested in working on this.

Wrapping up

Anyway, that's probably enough discussion for now. This post was a massive procrastination from writing another post on "halt points", which I've finished drafting and am currently editing. This was is mostly thoughts which have been stewing in the back of my mind after talking with colleages at work about state machines, type systems, and more accurately modeling programs. Thanks for reading!