nesting allocators
— 2023-08-09
I regularly point people to Tmandry's 2021 post "Contexts and Capabilities in Rust". While nobody is actively working on the design at the moment, it's more because of prioritization than anything else. It seems like it could make certain usage patterns a lot nicer in Rust, and one of those is probably working with custom allocators. I've been working a lot with slab allocators recently, and I'd love it if they were easier to work with in Rust. So I wanted to take a moment to talk about allocators, capabilities as a language feature, and why I believe that would work well.
Kinds of allocation
Memory, at least semantically, is best thought of as a contiguous sequence of bytes. Allocators manage these bytes and do the bookkeeping of who is using what. Allocators can take any type of object ("heterogeneous types") and place them in memory. And once we're done with an object, the allocator will reclaim the memory. Because not all objects are the same size, this can leave gaps in memory, so occasionally the allocator spends some time reorganizing its internals (think: defragmentation). In practice, every program wants a single root allocator to manage the entire program. This root allocator is what we refer to as the global allocator.
But while global allocators are great, they are not perfect. Typically when we write programs we'll see usage patterns in our main loop. For example: if we're writing an HTTP server we will typically allocate data for the request headers and body. And once that request is handled, we de-allocate it and allocate a new request. We can use our knowledge of our usage patterns to handle allocations in a more targeted way. The way we do this is by obtaining a relatively large chunk of memory from our system allocator, and then using that as the heap space for our custom allocator. These custom allocators are also known as "region-based allocators" or "arena allocators". Their key benefit is that their memory management strategies are very simple, which can make allocations so cheap they might as well be free.
There are many different types of arena allocators, but these are the ones I use the most:
- slab allocators: For example the slab crate in Rust. This is ideal for when you perform many homogeneous, short-lived allocations over a longer period.
- bump allocators: For example the bumpalo crate in Rust. This is ideal for phase-oriented allocations, where you allocate a number of heterogeneous objects, say, on every iteration of a loop and then de-allocate them at the end of it.
While not every crate lends itself equally well to it, it's even possible to combine arena allocators to further customize allocation behavior.
Nesting Allocators
The bumpalo crate comes with an optional `collections` feature which can be enabled to grant access to `String` and `Vec` types that are identical to the stdlib's versions, other than the fact that they allocate on the bump allocator. Using it looks like this:
```rust
use bumpalo::{Bump, collections::Vec};

let b = Bump::new();               // Create a new bump allocator
let v: Vec<i32> = Vec::new_in(&b); // Allocate on the bump allocator
```
This is an example of nesting allocators. The system allocator hosts the bump allocator, and the bump allocator in turn hosts the vector. The main downside of this is that we're not using `std::vec::Vec`, but have to use bumpalo's custom vec implementation. That's not ideal, not just because we now have a second `Vec` type, but because we also don't have other collection types such as `HashMap`, `HashSet`, and every other type out there on crates.io.
Passing Allocators
The plan is to solve this limitation via the nightly `allocator_api` feature in the stdlib. This extends the stdlib with methods such as `Vec::new_in` which can take a custom `Allocator` argument to host the allocations in. I don't have any particular opinions about the exact `Allocator` trait, but I do have opinions about the way this bubbles up to collection types.
Roughly speaking, the current design adds a new allocator generic param to collection types which defaults to the global allocator (`Vec<T, A = Global>`). In order to uniformly make use of this, every constructor method for every collection type needs to provide `_in` variants of its existing constructors. At first that seems reasonable; but it quickly runs into issues.
The first issue is usage with macros, which we can probably overcome. Macros like `vec!` perform allocations, and in order to work with custom allocators they should probably take an optional extra param for the allocator. The second issue is a lot harder: traits can't take extra arguments. For example, if you have an `&str` and you want to convert it to a `String` using a bump allocator, you can't keep using the existing `into` method. And because the `Into` trait is part of `core`, we can't just add a new `into_in` method to the `Into` trait either. The only real option is probably a new trait in `alloc`:
```rust
let s = "hello".to_owned();         // 1. Uses the global allocator

let bump = Bump::new();
let s = "hello".to_owned_in(&bump); // 2. Uses a bump allocator

let s: String = "hello".into();     // 1. Always uses the global allocator
                                    // 2. ❌ Unclear how to create a variant
                                    //    of this with a custom allocator
```
While the number of collection types and constructor methods is limited, things can quickly spiral out of hand when we factor in composition. In order for things to work uniformly across the ecosystem, every type which holds a `Vec` internally would need a second `_in` variant for all of its constructors and conversions. That kind of duplication is uncompelling, meaning it's likely we're only going to see some types adopt support for custom allocation while the majority doesn't. Not unlike how `#[no_std]` is not uniformly supported throughout the crates ecosystem today.
Scoped Allocators
What we're really struggling with is that the global allocator acts, in a capability sense, as an ambient authority. Which is to say: every function in Rust has access to the global allocator without functions needing to take allocators explicitly as arguments. The allocator is just always present. Except when we're in `#[no_std]` environments, where there is no allocator. Or when we want to use a different allocator.

The `_in` family of methods is an attempt to remedy the issue of having an ambient allocator. But that has some drawbacks: most notably the bifurcation of APIs. But also: it seems like it would be quite annoying to use. Manually having to pass some form of `alloc` argument down to every function call doesn't exactly seem ideal.
Instead Tmandry's capabilities design seems like a much more pleasant direction. It acknowledges the existence of ambient capabilities, and instead just asks that we express them at the signature level. Because they effectively function as implicits, they carry through to everything in scope which needs them:
```rust
let s = "hello".to_owned();     // 1. Uses the global allocator

with alloc = Bump::new() {
    let s = "hello".to_owned(); // 2. Uses the bump allocator
}

let s: String = "hello".into(); // 1. Always uses the global allocator

with alloc = Bump::new() {
    let s: String = "hello".into(); // 2. ✅ Uses the bump allocator
}
```
Compositions with functions and types would no longer require manually passing allocators into arguments, but merely acknowledging the existence of an allocator in the first place. That would mean we don't have to provide new `_in` methods which take allocators, but instead we could keep using the existing `new` methods with some minor modifications.
```rust
// Vec's design on nightly today, providing both `new` and `new_in` methods
impl Vec<T, A = Global> {
    fn new() -> Self { ... }
}
impl Vec<T, A> {
    fn new_in(alloc: A) -> Self { ... }
}

// A design of Vec using capabilities, with just a single `new` method
impl Vec<T>
with alloc = Global,
{
    fn new() -> Self { ... }
}
```
Obviously questions remain about how exactly to model capabilities. Many types in the stdlib liberally make use of ambient capabilities (e.g. allocation, filesystem, networking, etc.) which would need to keep working as they do today. But there is a question of how we can weave user-provided capabilities through the entire stdlib and crates.io ecosystem. The answer requires having a way to continue relying on ambient capabilities, but opt out if you want to. Which is not entirely unlike `const`.
The final benefit is that this would solve one of the current blockers we have for string literals. The obvious design for that would be to allow an `s` prefix on literals to create allocated strings. But because that would act as a language built-in, there is an open question of how to make that work with custom allocators. Using capabilities would provide a compelling solution for that:
```rust
// Rust today
let s = String::from("Chashu");

// With an `s` prefix, allocating on the global allocator
let s = s"Chashu";

// With an `s` prefix, allocating on a local allocator
with alloc = Bump::new() {
    let s = s"Chashu";
}
```
Conclusion
In this post we took a look at general allocation strategies, how allocations can be nested today, which problems that approach has, and how we could simplify that via the novel contexts/capabilities language feature. The design of contexts/capabilities is incomplete, so we intentionally didn't go too deep into it. Instead we only touched on the top-level semantics, showing specifically how it could be leveraged to overcome some of the problems we have with allocation today.
Fundamentally capabilities don't provide anything new; they're just a fancy way of passing arguments to functions. But I also believe that is their strength. It is just passing arguments to functions. But with significantly less code to write, no method duplication, the ability to play along with trait impls, and a lot more clarity about what your function needs in order to function.
I don't just think that capabilities could simplify the problem space for allocators in Rust. I actually believe that they could be the canonical way in which we solve the current `core`/`alloc`/`std` split in Rust. The `core` library really is just `std` minus everything which relies on ambient capabilities. If we could opt out of relying on ambient capabilities in `std`, the reason to have separate `core`, `alloc`, and `std` libraries should go away.
The other thought I want to share is that I believe capabilities such as allocators are inherently tree-structured. I've recently written about tree-structured concurrency, explaining how concurrency is best thought of as a tree. And on the operational semantics side of the language we now have Tree Borrows as well. I feel like, at a high level, thinking about program execution as a tree is accurate, and we should be thinking of (nesting) capabilities such as allocators as trees as well.