What is WASI?
— 2023-06-02
      
     
- introduction
- what is webassembly?
- bytecode, webassembly, and the jvm
- what are webassembly components?
- what is wasi?
- categorizing webassembly
- iterating on wasi
- the future of wasi
- conclusion
- appendix a: terminology cheat sheet
Introduction
Now that the final touches are being put on the second version of WASI (WASI Preview 2), I figured now might be a good time to write a few words on WASI, WebAssembly, and how to think about them. In recent months I've been driving the WASI Preview 2 work in the Rust compiler. In order to do that I've had to familiarize myself with where things currently are, where they are headed, and how to build a mental model of it all. Because while Wasm and WASI have been around for a while, WASI especially has undergone a lot of changes. Enough so that I figured it might be helpful to write a little about it.
What is WebAssembly?
WebAssembly is a bytecode format which compilers can compile programs into. You can think of "bytecode" as an intermediate format which can be converted to native formats such as x86 assembly, which is what most desktop CPUs run, or ARM v8 assembly which is what most mobile phones run. This is useful because it means you can "compile once" to WebAssembly Bytecode, and then run that code on any number of targets - provided they have a runtime which can interpret WebAssembly and convert it to native code.
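To make "bytecode" a little more concrete, here is a minimal sketch of a stack-based interpreter in Rust. The instruction set (`Op`) and the `run` function are made up for illustration. Real WebAssembly has a binary encoding, a validation step, and many more instructions, and production runtimes typically translate to native code rather than interpreting instructions one by one like this.

```rust
/// A toy instruction set, loosely inspired by WebAssembly's stack-machine
/// design. These names are hypothetical; this is not the real Wasm encoding.
#[derive(Debug, Clone, Copy)]
enum Op {
    /// Push a constant onto the stack.
    Const(i32),
    /// Pop two values, push their sum.
    Add,
    /// Pop two values, push their product.
    Mul,
}

/// Interpret a program and return the value left on top of the stack.
/// Returns `None` if the program underflows the stack or leaves it empty.
fn run(program: &[Op]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    for op in program {
        match *op {
            Op::Const(value) => stack.push(value),
            Op::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Op::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
        }
    }
    stack.pop()
}

fn main() {
    // (2 + 3) * 4, written as stack-machine bytecode.
    let program = [Op::Const(2), Op::Const(3), Op::Add, Op::Const(4), Op::Mul];
    println!("{:?}", run(&program)); // prints "Some(20)"
}
```

The same `program` value could be handed to any host that implements `run`, which is the "compile once" idea in miniature.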
WebAssembly was initially designed for web browsers, to provide a way to execute untrusted native code in a sandboxed environment. But this use case has since expanded, because it turns out that a high-performance trusted sandbox is a useful abstraction for a lot of things - including creating sandboxed environments for networked applications.
The best way to think about the space WebAssembly runtimes occupy is roughly at the same level as "Docker containers" or "VMs". But rather than virtualizing an entire operating system and managing that, WebAssembly runtimes use bytecode to virtualize single applications. Where a single computer may concurrently execute tens of VMs or hundreds of containers, it should be possible to run thousands of WebAssembly programs.
(These are obviously not exact numbers, but more of a broad illustration of scale. I know that, for example, Firecracker VMs are more performant than regular VMs, and that people have successfully launched a million plus Docker containers in contests. But those are not typical cases, and I believe that in the typical cases what I'm saying should be broadly accurate.)
Bytecode, WebAssembly, and the JVM
WebAssembly is sometimes compared to Java and the JVM. This makes sense, because the JVM is a popular platform which also interprets bytecode. The JVM not only supports Java but any language which compiles to JVM Bytecode, including Kotlin and Scala. However despite WebAssembly and the JVM both executing bytecode, there are some key differences between the two:
- WebAssembly was designed with native languages such as C, C++, and Rust in mind, while the JVM is more oriented towards garbage-collected languages. (This changes a bit with the introduction of GraalVM, but I think it still holds largely true.)
- WebAssembly is a royalty-free W3C standard, while the JVM somewhat famously isn't. This means implementing WebAssembly support in language toolchains and runtimes is not only possible, it's the exact goal of the project.
- WebAssembly was designed from the ground up with strict sandboxing as a core priority, while the JVM wasn't. In order to enable secure multi-tenancy with the JVM it can be advisable to wrap it in another isolation layer - such as a VM.
Note though that I don't mean to be harsh on the JVM at all with this. WebAssembly and the JVM target very different use cases, have different priorities in their design, and in turn excel at very different things.
What are WebAssembly Components?
WebAssembly as a bytecode format is also often referred to as "Core WebAssembly". When you compile a "Core WebAssembly" program it is converted into something called a "WebAssembly Module". You can roughly think of this as an object file in traditional compilation models. But unlike classic objects, Wasm modules can't describe system calls or reference any external symbols - they can only take numbers in and put numbers out, and that's about it. If you're trying to make Wasm programs do anything other than sum up numbers, you need more than just Core WebAssembly.
"WebAssembly Components" are a typed wrapper around Wasm Modules. Rather than reasoning about numbers in/numbers out, they provide a way to talk about types, functions, methods, and namespaces. The way this is done is via an IDL format called WIT (Wasm Interface Types). Here's an example of a "monotonic clock" interface taken from the preview2-prototyping repo:
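Here is a rough sketch of what "numbers in, numbers out" means in practice, written as plain Rust rather than actual Wasm. `LinearMemory`, `add`, and `count_uppercase` are hypothetical names: the point is only that anything richer than a number, such as a string, has to be smuggled across the boundary as a pointer and a length into a flat byte array, with both sides agreeing on the convention out of band.

```rust
/// A stand-in for a module's linear memory: a flat array of bytes.
/// (Hypothetical type for illustration, not a real runtime API.)
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Host side: copy `data` into memory at `ptr` before making a call.
    /// Panics if the data does not fit; good enough for a sketch.
    fn write(&mut self, ptr: u32, data: &[u8]) {
        let start = ptr as usize;
        self.bytes[start..start + data.len()].copy_from_slice(data);
    }
}

/// The easy case: only numbers cross the boundary.
fn add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// The awkward case: a "string" argument is really two numbers, an offset
/// and a length, which both sides must interpret the same way.
fn count_uppercase(memory: &LinearMemory, ptr: u32, len: u32) -> u32 {
    let start = ptr as usize;
    let end = start + len as usize;
    memory.bytes[start..end]
        .iter()
        .filter(|byte| byte.is_ascii_uppercase())
        .count() as u32
}

fn main() {
    println!("{}", add(2, 3)); // prints "5"

    let mut memory = LinearMemory::new(64);
    let text = b"Hello WASI";
    memory.write(8, text);
    // Nothing in the signature says "this is a string at offset 8".
    println!("{}", count_uppercase(&memory, 8, text.len() as u32)); // prints "5"
}
```

Nothing in the type of `count_uppercase` records that the two integers describe a string, which is the kind of gap a typed wrapper is meant to close.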
default interface monotonic-clock {
    use poll.poll.{pollable}
    /// A timestamp in nanoseconds.
    type instant = u64
    /// Read the current value of the clock.
    ///
    /// The clock is monotonic, therefore calling this function repeatedly will
    /// produce a sequence of non-decreasing values.
    now: func() -> instant
    /// Query the resolution of the clock.
    resolution: func() -> instant
    /// Create a `pollable` which will resolve once the specified time has been
    /// reached.
    subscribe: func(
        when: instant,
        absolute: bool
    ) -> pollable
}
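To give a feel for how an interface like this might surface in a programming language, here is a hand-written Rust sketch of the `now` and `resolution` functions. This is not the output of any real bindings generator: the trait, the `HostClock` type, and the reported resolution of one nanosecond are all assumptions made for illustration, and `subscribe` is left out because it depends on the `pollable` type.

```rust
use std::time::Instant as StdInstant;

/// Mirrors `type instant = u64`: a timestamp in nanoseconds.
type Instant = u64;

/// Hand-written counterpart of the WIT interface (hypothetical, not generated).
trait MonotonicClock {
    /// Read the current value of the clock.
    fn now(&self) -> Instant;
    /// Query the resolution of the clock.
    fn resolution(&self) -> Instant;
}

/// A host-side implementation backed by the standard library's monotonic clock.
struct HostClock {
    start: StdInstant,
}

impl HostClock {
    fn new() -> Self {
        Self { start: StdInstant::now() }
    }
}

impl MonotonicClock for HostClock {
    fn now(&self) -> Instant {
        // Nanoseconds since this clock was created, truncated to 64 bits.
        self.start.elapsed().as_nanos() as u64
    }

    fn resolution(&self) -> Instant {
        // Assumption: report 1ns. A real host would query the platform.
        1
    }
}

fn main() {
    let clock = HostClock::new();
    let first = clock.now();
    let second = clock.now();
    // Repeated calls produce a sequence of non-decreasing values.
    assert!(second >= first);
    println!("elapsed between calls: {}ns", second - first);
}
```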
A Wasm Component is a single typed object consisting of a Core Wasm Module plus corresponding WIT definitions. Components can specify that they either export specific interfaces, or require that other interfaces are imported. For example, I could write a binary program which prints the number of seconds elapsed, which exports a main function with no arguments, and imports both the stdout and monotonic-clock interfaces.
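The import/export relationship can be sketched in ordinary Rust by treating imports as traits the program receives and the export as a function it provides. Everything here is hypothetical (`run`, `SteppingClock`, `RecordingStdout`, and the simplified trait shapes); a real component would have its imports wired up by the runtime instead of being passed in as arguments.

```rust
/// Imported interface: a monotonic clock reporting nanoseconds.
trait MonotonicClock {
    fn now(&mut self) -> u64;
}

/// Imported interface: somewhere to write text.
trait Stdout {
    fn write_line(&mut self, line: &str);
}

/// The "export": an entry point with no data arguments. Everything the
/// program can touch is handed to it through its imports.
fn run(clock: &mut impl MonotonicClock, out: &mut impl Stdout) {
    let start = clock.now();
    // ... the program would do its actual work here ...
    let end = clock.now();
    let seconds = (end - start) / 1_000_000_000;
    out.write_line(&format!("{}s elapsed", seconds));
}

/// A fake clock that advances by a fixed step on every call.
struct SteppingClock {
    next: u64,
    step: u64,
}

impl MonotonicClock for SteppingClock {
    fn now(&mut self) -> u64 {
        let current = self.next;
        self.next += self.step;
        current
    }
}

/// A fake stdout that records lines instead of printing them.
#[derive(Default)]
struct RecordingStdout {
    lines: Vec<String>,
}

impl Stdout for RecordingStdout {
    fn write_line(&mut self, line: &str) {
        self.lines.push(line.to_string());
    }
}

fn main() {
    // The "host" decides which implementations satisfy the imports.
    let mut clock = SteppingClock { next: 0, step: 3_000_000_000 };
    let mut out = RecordingStdout::default();
    run(&mut clock, &mut out);
    println!("{}", out.lines[0]); // prints "3s elapsed"
}
```

Because `run` only sees its imports, swapping the fake clock for a real one changes nothing about the program itself.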
One way to think about Wasm Components is: "What if we had ML-style modules in our linker?", and that thought was taken all the way through from the system call layer to the way libraries are linked to each other. With the added benefit that each Wasm Component operates as its own security boundary, using a "shared-nothing" approach to ensure isolation.
What is WASI?
WASI stands for the "WebAssembly System Interface". Some people have recently also started dubbing it the "WebAssembly Standard Interfaces", since it covers far more than just operating system interfaces. The way I think about this is as sets of standard Wasm Components which can be implemented by any vendor and targeted by any toolchain. This can include APIs such as socket-based networking, or filesystem access. But also APIs such as HTTP-based clients and servers, or even message queue interfaces.
Not all WASI interfaces are created equally though. For example: a serverless environment may not want to expose direct access to the filesystem. Or the stdlib of a programming language may not want to provide access to message queue APIs. This differentiation in goals is why WASI has a notion of "Worlds". The different "Worlds" are still in the process of being defined, but the following two are actively being worked on right now:
- A world modeling a traditional operating system, providing access to sockets, stdio, filesystem access, etc.
- A world modeling a "bursty" environment, typically best suited for serverless applications. This will include access to handling HTTP requests, making HTTP requests, access to object storage, message queues, etc.
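One way I picture worlds is as sets of interfaces: a component fits into a world if everything it imports is something that world provides. The sketch below models just that check in Rust. The interface names are placeholders I made up, and the actual contents of each world are still being defined, so treat this purely as an illustration of the idea.

```rust
use std::collections::BTreeSet;

/// A world, reduced to the set of interface names a host provides.
struct World {
    name: &'static str,
    provides: BTreeSet<&'static str>,
}

/// A component, reduced to the set of interface names it imports.
struct Component {
    imports: BTreeSet<&'static str>,
}

impl World {
    /// Returns the imports this world cannot satisfy, in sorted order.
    /// An empty result means the component fits into this world.
    fn missing_imports(&self, component: &Component) -> Vec<&'static str> {
        component
            .imports
            .difference(&self.provides)
            .copied()
            .collect()
    }
}

fn main() {
    // Placeholder interface names, not the real WASI identifiers.
    let os_world = World {
        name: "operating-system",
        provides: BTreeSet::from(["clocks", "filesystem", "sockets", "stdio"]),
    };
    let serverless_world = World {
        name: "serverless",
        provides: BTreeSet::from(["clocks", "http", "message-queue", "object-storage"]),
    };

    let file_tool = Component {
        imports: BTreeSet::from(["clocks", "filesystem"]),
    };

    for world in [&os_world, &serverless_world] {
        let missing = world.missing_imports(&file_tool);
        println!("{}: missing {:?}", world.name, missing);
    }
    // prints:
    // operating-system: missing []
    // serverless: missing ["filesystem"]
}
```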
The way WASI interfaces are standardized is via the W3C's WebAssembly working group. People from across the industry come together to work on defining these interfaces, which once accepted are then published as a standard. This is "standard" with a capital S. The standardization process can take time, but it has the upside that once ratified you can know for a fact just about everyone in the space will be adopting it.
Categorizing WebAssembly
Now that we've talked about what Wasm and WASI are, let's talk about what they aren't. Or well, perhaps more: what they aren't just. WebAssembly falls into multiple categories all at once, which means it also kind of escapes categorization entirely. Depending on which angle you take, you can think of WebAssembly as:
- A programming language: The WebAssembly text format is based on S-Expressions, and it can be authored by hand if you want to. WIT is a similar format targeted towards linkers, and incorporates a lot of work from the ML world.
- An operating system: Whether you're running WASI on Windows, on Linux, or directly on hardware - the host you're targeting is WASI, and it should behave the same everywhere. From a programmer's perspective, WASI becomes the operating system. The exact details of what's backing the runtime shouldn't matter.
- A virtual machine: Wasm provides strict sandboxing guarantees, meaning that you're able to trust a WebAssembly runtime with the same guarantees as you would trust a VM. A Wasm program cannot ever jump out of its own memory and access the memory of a different program running on the same computer. But it goes one step further:
- An application runtime: Part of WASI are definitions for service-level APIs such as "HTTP handling" and "message queue access". APIs which wouldn't be out of place in an application runtime such as dotnet, or a cloud vendor such as Azure.
- A single-host container orchestrator: rather than dynamically linking applications on the same host via HTTP, WASI provides an alternate model to dynamically link programs without sharing any memory.
- A compiler: WebAssembly itself is a bytecode format which is the target of language toolchains. But the Wasm runtimes themselves need to take that bytecode and convert it to machine code, which means there is a lot of compiler work involved.
- An ABI, memory model, and linker: Wasm Components in many ways are the platform ABI. And because of how sandboxing has been implemented, one could argue that Wasm has its own memory model. And the linker is basically an ML module system in disguise. All put together, it's basically its own model of a system, which behaves quite differently from most traditional platforms.
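To illustrate the sandboxing point from the list above: every memory access a Wasm program makes is relative to its own linear memory and is bounds-checked, so an out-of-range access stops the program instead of reaching someone else's data. Here is a minimal sketch of that idea. The `Sandbox` type and `OutOfBounds` error are hypothetical, and real runtimes use much cheaper techniques (such as guard pages) than an explicit check on every load.

```rust
/// Returned when an access falls outside the sandbox's memory.
/// A real runtime would trap and stop the program instead.
#[derive(Debug, PartialEq)]
struct OutOfBounds;

/// A program's private linear memory (hypothetical type for illustration).
struct Sandbox {
    memory: Vec<u8>,
}

impl Sandbox {
    fn new(size: usize) -> Self {
        Self { memory: vec![0; size] }
    }

    /// Store a little-endian `u32` at `addr`, refusing out-of-range writes.
    fn store_u32(&mut self, addr: u32, value: u32) -> Result<(), OutOfBounds> {
        let start = addr as usize;
        let end = start.checked_add(4).ok_or(OutOfBounds)?;
        let slot = self.memory.get_mut(start..end).ok_or(OutOfBounds)?;
        slot.copy_from_slice(&value.to_le_bytes());
        Ok(())
    }

    /// Load a little-endian `u32` from `addr`, refusing out-of-range reads.
    fn load_u32(&self, addr: u32) -> Result<u32, OutOfBounds> {
        let start = addr as usize;
        let end = start.checked_add(4).ok_or(OutOfBounds)?;
        let bytes = self.memory.get(start..end).ok_or(OutOfBounds)?;
        Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
    }
}

fn main() {
    let mut sandbox = Sandbox::new(16);
    sandbox.store_u32(0, 42).expect("address 0 is in bounds");
    println!("{:?}", sandbox.load_u32(0)); // prints "Ok(42)"

    // Bytes 14..18 would run past the 16-byte memory, so the access fails.
    println!("{:?}", sandbox.load_u32(14)); // prints "Err(OutOfBounds)"
}
```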
In "Embrace the Kinda" SunfishCode talks about the categorical ambiguity of WebAssembly in more detail.
Iterating on WASI
The latest version of WASI currently available is called: "WASI Preview 1". For the past four years people have been hard at work on defining "WASI Preview 2", and everything I've written about in this post has been about Preview 2. The Preview 1 version of WASI was much closer to just an operating system layer. But it quickly became clear that this would hit some pretty big limitations, and if WASI wanted to live up to its stated goals, it would need to change.
Preview 2 is a complete rework of Preview 1, introducing WIT, Wasm Components, and all sorts of new standard interfaces. What it doesn't yet do is provide first-class async primitives, which are scheduled for WASI Preview 3. Threads are also missing from Preview 2, and work on those is still ongoing.
One neat thing about the way WASI is structured is that virtualization layers can be nested. In order to make upgrading between WASI versions easier, a shim will be provided which allows Preview 1 code to continue working in hosts which only implement Preview 2.
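Conceptually such a shim is an adapter: it exposes the old interface and implements it purely in terms of the new one. The Rust sketch below shows the shape of that for a single clock call. All of it is simplified and hypothetical: the `Preview2Clock` trait, the `Preview1Shim` type, and the numeric constants are illustrative stand-ins, not the real Preview 1 or Preview 2 definitions.

```rust
/// Simplified stand-in for a Preview 2 style interface: a typed clock.
trait Preview2Clock {
    /// Current time in nanoseconds.
    fn now(&self) -> u64;
}

/// Illustrative constants for the older, number-based calling style.
/// The values are placeholders chosen for this sketch.
const CLOCK_MONOTONIC: u32 = 1;
const ERRNO_INVALID: u16 = 28;

/// The shim: offers a Preview 1 style function, backed by a Preview 2 host.
struct Preview1Shim<H> {
    host: H,
}

impl<H: Preview2Clock> Preview1Shim<H> {
    /// Old-style call: the clock is selected by a numeric id and failures
    /// are reported as numeric error codes.
    fn clock_time_get(&self, clock_id: u32) -> Result<u64, u16> {
        match clock_id {
            CLOCK_MONOTONIC => Ok(self.host.now()),
            _ => Err(ERRNO_INVALID),
        }
    }
}

/// A fake host clock that always reports the same time.
struct FixedClock(u64);

impl Preview2Clock for FixedClock {
    fn now(&self) -> u64 {
        self.0
    }
}

fn main() {
    let shim = Preview1Shim { host: FixedClock(1_500) };
    println!("{:?}", shim.clock_time_get(CLOCK_MONOTONIC)); // prints "Ok(1500)"
    println!("{:?}", shim.clock_time_get(99)); // prints "Err(28)"
}
```

Because the shim itself only depends on the newer interface, it can be layered on top of any host that provides it, which is what makes the nesting work.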
Preview releases are intentionally backwards-incompatible, and the idea is for them to be deprecated over time. Eventually the plan is to release a "WASI 1.0" specification which will provide more stability. The hope, at least, is that with each WASI preview release, the number of changes between major versions will shrink - so that the final 1.0 release will represent mostly a formalization of what people have already been using for a while.
The future of WASI
As mentioned, WASI Preview 2 doesn't yet have a model for first-class async, or for threading. But it also doesn't yet provide any reasoning for multi-host (distributed) linking of programs, nor are all of the WASI interfaces fully specified yet. These are all things coming down the pipeline, which I'm really excited about. Specifically for first-class async, the plan is to model that using a structured model. Having access to that in
Beyond core WASI features, there is a lot of other interesting work ongoing. Maybe most interesting in my opinion is the paper: "Going beyond the Limits of SFI: Flexible and Secure Hardware-Assisted In-Process Isolation with HFI" which was a joint project between UCSD, Intel, and Fastly to provide a new set of instructions to make in-process sandboxing faster and cheaper. This should greatly benefit WebAssembly, allowing it to sandbox with even less overhead.
Work on WASI Preview 2 is currently ongoing, and it should be released later this year. On behalf of Microsoft I'm currently working on the Preview 2 implementation for rustc together with folks from Fastly and Fermyon. But work is simultaneously happening for other language toolchains, runtimes, and platforms as well. With WASI Preview 2 including async socket support (though not yet multi-threading), I think this may finally be the year WebAssembly starts living up to its expectations. And I'm incredibly excited for that!
Conclusion
The way I roughly think about WebAssembly is as a hopeful vision of what computing can be. It represents an opportunity to take the last 40 years of operating system research, compiler development, and industry experience and combine that into a form that is both coherent and accessible. I hope this post can provide some insight on what WebAssembly and WASI are, and where things are headed.
Appendix A: Terminology Cheat Sheet
- Wasm: short for "WebAssembly".
- WebAssembly, or "Core WebAssembly": A bytecode format which can be translated to native code by a "WebAssembly interpreter" or "WebAssembly runtime".
- WASI: short for "WebAssembly System Interface". Sometimes also referred to as: "WebAssembly Standard Interfaces". It provides a set of standardized component interfaces defined using the WIT IDL.
- WIT: short for "WebAssembly Interface Types". It is a language to define inter-component interfaces, and is used as part of Wasm Components.
- IDL: "Interface Description Language" is a meta-language which defines program interfaces. WIT is an example of an IDL.
- Wasm Module: best thought of as "object files" for WebAssembly. It's a single binary object containing core Wasm bytecode.
- WebAssembly Component: a typed wrapper around a Wasm Module. It combines a typed WIT interface and a Wasm module to create a typed component.
- WebAssembly Worlds: A combination of different WASI interfaces, packaged up into single environments. The bulk of development work in this space is currently going into defining an "operating system world" and a "bursty world" (serverless).
- WASI Preview 1: The snapshot of WASI development which was released in 2019. This wasn't originally versioned and was previously just referred to as "WASI".
- WASI Preview 2: The version of WASI which is being released later this year, built on top of Wasm Components.
- Linker: A program which glues a number of other programs together into a single program. The parts which are glued together are usually called "libraries" or "objects", and the output program is usually called a "binary". With Wasm Components the differences between libraries and binaries are much less clear.