Reasoning about ABIs
— 2023-11-18

Note: this isn't one of my usual research posts, but more a loose collection of thoughts I've been mulling over recently. I might start tagging these as "short", since I'm writing and publishing before fully validating.

Not long after I started programming I developed an intuition for what APIs are: they are the interfaces we use in our applications to communicate between distinct components. But there was this other term, "ABI", which I didn't quite understand. "Application Binary Interface" is not very descriptive if you haven't used it before. Does it mean "binary" because it's encoded? But APIs use encodings too. If an ABI is separate from an API, are we not supposed to program against ABIs? What are the differences?

Over the past few months I've been helping out with getting WASI Preview 2 over the finish line ¹. WASI Preview 2 includes Wasm Components and WIT definitions, which are both part of an overarching system called "The Component Model". In broad terms, Components in Wasm are able to communicate with one another via a stable ABI encoding, which is accompanied by WIT (an Interface Definition Language) that can be used to describe system interfaces.

¹ ETA for WASI Preview 2 at the time of writing: either December this year or January next year; it's November now.

For example, a WIT definition can look like this:

package example:cat;

interface cat {
  meow: func() -> string;
}

world cats {
  export cat;
}

This defines a package cat in the namespace example, along with a world cats that exports the cat interface. In the component model we know how to take this interface and encode it as a Wasm Component, which can then be imported directly by other components, and have bindings generated for it in any supported language via wit-bindgen. The WASI (WebAssembly System/Standard Interfaces) APIs are all defined in terms of WIT, and describe standard ways of interfacing with things like system clocks, filesystems, networks, and in the future caches and queues too.
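As a rough illustration of what "bindings" means here, the cat interface could surface in Rust as something like the trait below. This is a hand-written sketch and not actual wit-bindgen output; the names `Cat` and `Tabby` are made up for the example.

```rust
// Hand-written approximation of bindings for the `cat` interface.
// NOT real wit-bindgen output; `Cat` and `Tabby` are hypothetical names.

/// Mirrors `interface cat { meow: func() -> string; }`.
trait Cat {
    fn meow() -> String;
}

/// A component exporting the `cat` interface supplies an implementation.
struct Tabby;

impl Cat for Tabby {
    fn meow() -> String {
        String::from("meow")
    }
}

fn main() {
    println!("{}", Tabby::meow());
}
```

The point is only that the WIT signature carries enough type information to produce a typed function on the other side; the real generated code also has to handle moving the string across the component boundary.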

I'm really used to thinking of ABIs in terms of data encodings and byte offsets. In Rust we can slap #[repr(C)] on a type to change the way it's laid out in memory, and pub extern "C" on a function to change the way it's called. But that has limitations; for example, it doesn't know what to do with async fn. In the past I've also worked with other IDL formats such as WebIDL, COM, and WinMD. And while all of these have helped me understand what an ABI does, they haven't really helped me understand what an ABI fundamentally is. But having worked with WIT, I now believe that a better way to think of ABIs is as just a different application of type systems:

Application Binary Interface: A type system with a specified serialization format and calling convention ².
Programming Language: A type system with runtime behavior (operational semantics).

² A "calling convention" determines the way functions are invoked; how arguments are mapped to registers, etc.
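To make the "serialization format and calling convention" half of that definition concrete, here is a small Rust sketch. The names `Sample` and `sample_value` are invented for the example, and the size and padding noted in the comments assume a mainstream platform where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// `#[repr(C)]` pins down the data encoding: fields are laid out in
// declaration order, with padding following the platform's C rules.
#[repr(C)]
pub struct Sample {
    pub tag: u8,    // offset 0, then 3 bytes of padding (assuming 4-byte u32 alignment)
    pub value: u32, // offset 4
    pub extra: u16, // offset 8, then 2 bytes of tail padding
}

// `extern "C"` pins down the calling convention: how the argument and
// the return value travel through registers or the stack.
pub extern "C" fn sample_value(sample: Sample) -> u32 {
    sample.value
}

fn main() {
    println!(
        "size = {}, align = {}",
        size_of::<Sample>(),
        align_of::<Sample>()
    );
    let s = Sample { tag: 1, value: 42, extra: 7 };
    println!("value = {}", sample_value(s));
}
```

Without `#[repr(C)]` the compiler is free to reorder the fields, which is exactly the kind of freedom an ABI takes away.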

The Wasm Component model doesn't just define types like "string" or "record"; it also has a (limited ³) notion of generic types. For example result and option are generic types, so you can define functions which return a result<string, error> to indicate they're fallible. I never considered that ABIs could contain ADTs ⁴, but WIT clearly has them. To me that signals that (with some constraints) we can apply the wealth of type system theory we've built up over the years to ABIs as well.

³ There are currently four built-in generics in WIT. It's unclear whether it'll support user-defined generics in the future since that would complicate implementations. I'd love to see it though.

⁴ "Algebraic Data Type": a type built by combining other types. These include both enums (sum types) and structs (product types) in Rust. What's cool about them is that they're compositional, because an ADT can carry more ADTs.
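To make "a type system with a specified serialization format" a bit more tangible, here is a toy byte encoding for a result-like type in Rust. The layout (one discriminant byte followed by a little-endian payload) is invented for this example and is not the Canonical ABI's actual encoding; `encode` and `decode` are hypothetical helpers.

```rust
// Toy encoding for `Result<u32, u8>`. The byte layout is an assumption
// made up for illustration, not the real Canonical ABI:
//   byte 0     : discriminant (0 = ok, 1 = err)
//   bytes 1..5 : payload, little-endian, zero-padded

fn encode(value: Result<u32, u8>) -> [u8; 5] {
    let mut out = [0u8; 5];
    match value {
        Ok(n) => {
            out[0] = 0;
            out[1..5].copy_from_slice(&n.to_le_bytes());
        }
        Err(code) => {
            out[0] = 1;
            out[1] = code;
        }
    }
    out
}

fn decode(bytes: [u8; 5]) -> Option<Result<u32, u8>> {
    match bytes[0] {
        0 => Some(Ok(u32::from_le_bytes([
            bytes[1], bytes[2], bytes[3], bytes[4],
        ]))),
        1 => Some(Err(bytes[1])),
        _ => None, // unknown discriminant: not a valid value of this type
    }
}

fn main() {
    let bytes = encode(Ok(258));
    println!("{:?}", bytes);
    println!("{:?}", decode(bytes));
}
```

Once both sides agree on the type and on this layout, neither side needs to know what language the other was written in.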

The boundaries between ABIs and programming languages can often get murky in practice though. As we've seen with extern "C" in Rust, programming languages can define ABIs internally as well. And vice versa: some ABIs may depend on runtime semantics of a language too. But in the broadest terms: I believe an ABI must define a data encoding, but doesn't need to define runtime semantics. And a programming language must define operational semantics, but doesn't have to define an encoding ⁵. I feel like that provides a much crisper way to think about what ABIs actually are, rather than just what they're used for.

⁵ A data encoding is still necessary to actually encode instructions on a machine, but that doesn't need to be part of the programming language; it can be an implementation detail of the compiler. Multiple compilers can faithfully compile the same language without agreeing on the encoding, as long as the operational semantics end up working as specified.

Two Kinds of ABIs (added 2023-11-18)

I think it's fair to say there are two ways to think about ABIs. One is just as the low-level encoding of types plus calling conventions. The Wasm Component Model has a document dedicated to what is called "the canonical ABI", which only specifies the encoding and calling conventions.

The way I'm talking about ABIs in this post is one step above that, where we not only consider the WASI canonical ABI, but also the WIT IDL format as a part of "ABI". Both are designed with each other in mind, and having WIT without the canonical ABI doesn't make much sense. This feels very similar to my experience working on windows-rs, which ingests WinMD definitions and knows how to project those into the WinRT calling convention. Maybe it makes sense to just have ABIs refer to the low-level calling convention, and have a different name for the IDL + ABI system? But given how tightly these are linked, I'm not sure it's worth distinguishing between the two. Or would talking about it in terms of "ABI encoding" and "ABI description" be enough? If anyone knows of better terminology for this, I'd love to hear it!

Why ABIs require a type system (added 2023-11-18)

Jonathan Pallant asked some really good questions on Mastodon about whether high-level IDLs such as WIT can conceptually be separated from the underlying encoding. From their perspective as an embedded systems engineer, ABIs are about object encodings + calling conventions. Since from their perspective it was primarily about encodings, the need for a higher-level IDL or a type system didn't really make sense. Which is a really fair perspective, and it made me scratch my head a little. But I think I've found an explanation for why the two can't be separated.

Take for example C. There is the programming language "C", and then there are separate calling conventions. For example, on POSIX x86-64 platforms C uses the "x86-64 System V" calling convention. On 32-bit x86 POSIX platforms it uses the 32-bit "System V" calling convention. Gankra has written a great post about the C language and the importance of ABIs.

I believe that when we talk about "C ABIs" (plural as there is no canonical ABI), it's not enough to just point at the encoding and calling conventions. As Gankra covered in her post; it's also important that there is a shared understanding of types. You can't implement the "C ABI" if you don't also encode what an int in C is. If intmax_t was not part of the "C ABI", then it would not be an issue to change it either.
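A small Rust sketch of why the type definitions belong to the ABI: the C standard only sets minimum ranges for int and long, and each platform's ABI fixes the actual widths. The program below just prints what the current target uses.

```rust
use std::ffi::{c_int, c_long};
use std::mem::size_of;

fn main() {
    // These aliases resolve to different Rust integer types depending on
    // the target, because the width of C's `int` and `long` is decided
    // by the platform ABI rather than by the C language itself.
    println!("c_int  is {} bytes", size_of::<c_int>());
    println!("c_long is {} bytes", size_of::<c_long>());
}
```

On 64-bit Linux long is 8 bytes, while on 64-bit Windows it is 4 bytes. Two pieces of code can agree perfectly on registers and stack layout and still disagree about what a long is.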

Taking this back to Wasm Components: there is something called the "Canonical ABI" which defines the encoding of Wasm Components. But if anything wants to implement the "Wasm Component ABI", it can't implement only the encodings provided by that. WASI defines a number of built-ins which have a stable encoding, and thus can be considered part of the ABI too.

This to me feels like the most convincing argument for why when we talk about ABIs we cannot meaningfully separate the higher-level language (C or WIT) from the lower-level encoding ("x86-64 System V" or "WASI Canonical ABI"). The types defined in the higher-level language are a part of the overall contract we call "ABI" ⁶. And those types cannot be defined without also defining a type system to define those types in.

⁶ The abi-cafe project is a good example here. It tests the compatibility of various projects which output C ABI. One of its trophy cases is an incompatibility between clang and gcc on x86 Linux over how __int128 should be encoded when passed on-stack. And that can only be an issue if __int128 is considered part of the C ABI.
