Fallible Iterator Adapters
— 2020-04-12

A great aspect of the Rust stdlib is that many common operations are provided with a shared idiom. Rather than say having to remember how to write a correct sort function by hand, the stdlib provides a fast one for you.

One of the places with the most such conveniences is the std::iter::Iterator trait. This trait provides a rich set of operations for filtering, combining and merging streams of items. And because iterators are lazy they're incredibly efficient too 1.

1

Work doesn't start until a driver method calls it. Examples include next, fold, and for_each. Also because of the way Rust's iterators are setup the compiler can inline them until only the smallest possible state machine remains. Not all languages can do this, and the result is really cool!

However one downside to iterators is that they don't play well with all input. In particular there are some limits around Result (and Try in general). When using any stream adapter with fallible input2, we generally want to be able to do several things:

For example with the try_for_each function we're able to take a vec of names, and perform some IO operation on them. This will stop iterating over file names if an error occurs, and will immediately return the error:

use std::io::{stdout, Write};

let data = ["no_tea.txt", "stale_bread.json", "torrential_rain.png"];
data.iter().try_for_each(|x| {
    writeln!(stdout(), "{}", x)?;
    Ok(())
})?;

This works pretty smoothly. Now imagine we wanted to do something with IO operations. Say we had a folder for a database, and we wanted to compare the content of several files with each other, finding the smallest string by index 3.

3

Yes I know this example is completely unreasonable, but it's late here and I had to come up with something that did some IO. min_by is the actual method a friend brough up when building a database though, so at least we're hitting the right method here (:

Now we would likely want to use something like fs::read and Iterator::min_by, which we could do as such:

let offsets = [1, 4, 9, 18, 22, 17]

let res = db_files.iter().min_by(|x, y| {
    let fx = fs::read_file(format!(".db/{}.index"), x).unwrap();
    let fy = fs::read_file(format!(".db/{}.index"), y).unwrap();
    fx.cmp(fy)
});

Now as you might have noticed, we need to use unwrap here. Oh no! What this practically means is that if any error occurs when reading files our program will panic. Which means that Iterator::min_by probably shouldn't be used here, and we should turn this into manual loop logic instead.

Surveying Fallible Adapters

In order to figure out which methods are incompatible with fallible operations we should take a look at all methods on Iterator. The only methods that have the potential to be incompatible are the ones that take closures.

We've created an overview of all iterator methods that take closures. Here we look at the iterator adapter name, the signature of the closure that's passed, and whether it can short-circuit on error. Finally we also look at whether it has a fallible counterpart that can short-circuit.

In the following overview there are probably ways to make certain methods work with errors, or perhaps they work if Result has been passed into the method, but does not originate from the method. The measure we'll be using is whether the ? operator can be used inside the closure.

NameClosure SignatureAccepts try operator?Fallible counterpart
allFnMut(Self::Item) -> boolNo-
anyFnMut(Self::Item) -> boolNo-
cmp_byFnMut(Self::Item, <I as IntoIterator>::Item) -> OrderingNo-
eq_byFnMut(Self::Item, <I as IntoIterator>::Item) -> OrderingNo-
filterFnMut(&Self::Item) -> boolNo-
filter_mapFnMut(Self::Item) -> Option<B>Yes-
findFnMut(&Self::Item) -> boolNotry_find
find_mapFnMut(Self::Item) -> Option<B>Yes-
flat_mapFnMut(Self::Item) -> impl IntoIteratorYes-
foldFnMut(B, Self::Item) -> BNotry_fold
for_eachFnMut(Self::Item)Notry_for_each
inspectFnMut(&Self::Item)Yes-
is_partitionedFnMut(Self::Item) -> boolNo-
is_sorted_byFnMut(&Self::Item, &Self::Item) -> Option<Ordering>No-
is_sorted_by_keyFnMut(&Self::Item) -> impl PartialOrdNo-
mapFnMut(Self::Item) -> BYes-
max_byFnMut(&Self::Item, &Self::Item) -> OrderingNo-
max_by_keyFnMut(&Self::Item) -> impl OrdNo-
min_byFnMut(&Self::Item, &Self::Item) -> OrderingNo-
min_by_keyFnMut(&Self::Item) -> impl OrdNo-
partial_cmp_byFnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>No-
partitionFnMut(&Self::Item) -> boolNo-
partition_in_placeFnMut(&T) -> boolNo-
positionFnMut(Self::Item) -> boolNo-
rpositionFnMut(Self::Item) -> boolNo-
scanFnMut(&mut St, Self::Item) -> Option<B>No-
skip_whileFnMut(&Self::Item) -> boolNo-
take_whileFnMut(&Self::Item) -> boolNo-
try_findFnMut(&Self::Item) -> impl Try<Ok = bool, Error = E>Yes-
try_foldFnMut(B, Self::Item) -> impl Try<Ok = B>Yes-
try_for_eachFnMut(Self::Item) -> impl Try<Ok = ()>Yes-

Out of the 63 current stream adapters available, 30 take closures. Of those 30 adapters only 10 can bubble up a Result or have a counterpart that can do so. That means that 20 methods, or almost a third of the functionality in Iterator is currently incompatible with fallible operations.

Implementing missing adapters

If we wanted to expose the full range of functionality to fallible iterators, we would likely need to create try_ counterparts to the stream adapters that currently don't support it. This is a list of methods and their proposed fallible counterparts:

NameFallible counterpart
alltry_all
anytry_any
cmp_bytry_cmp_any
eq_bytry_eq_by
filtertry_filter
inspecttry_inspect
is_partitionedtry_is_partitioned
is_sorted_bytry_is_sorted_by
is_sorted_by_keytry_is_sorted_by_key
max_bytry_max_by
max_by_keytry_max_by_key
min_bytry_min_by
min_by_keytry_min_by_key
partial_cmp_bytry_partial_cmp_by
partitiontry_partition
partition_in_placetry_partition_in_place
positiontry_position
rpositiontry_rposition
scantry_scan
skip_whiletry_skip_while
take_whiletry_take_while

Conclusion

In this post we've looked at the various adapters std::iter::Iterator provides, talked about error handling, and surveyed how existing methods interact with them. Finally we've provided an overview of methods that currently don't provide support for the try operator in the closure body, and which counterparts would need to be added so they could work.

Some readers may be wondering now whether it's actually worth adding several new methods to Iterator, and whether that's perhaps too much. But I don't think counting methods is a particularly effective way of making API decisions 4. Instead it's better to look at overall cohesion and usability.

I think a more useful framework for answering this design question is to ask: "Are we okay that a third of the methods on iterator only work some of the time?" I think the answer there should be a clear "no". If a person wants to perform IO operations while filtering some input, they should be able to, without needing to start over and write manual loops from scratch.

4

I've definitely done this a lot in the past, and I was wrong for doing so. Growth.