Fallible Iterator Adapters
— 2020-04-12

A great aspect of the Rust stdlib is that many common operations are provided with a shared idiom. Rather than say having to remember how to write a correct sort function by hand, the stdlib provides a fast one for you.

One of the places with the most such conveniences is the std::iter::Iterator trait. This trait provides a rich set of operations for filtering, combining and merging streams of items. And because iterators are lazy they're incredibly efficient too ¹.

Work doesn't start until a driver method calls it. Examples include next, fold, and for_each. Also because of the way Rust's iterators are setup the compiler can inline them until only the smallest possible state machine remains. Not all languages can do this, and the result is really cool!

However one downside to iterators is that they don't play well with all input. In particular there are some limits around Result (and Try in general). When using any stream adapter with fallible input², we generally want to be able to do several things:

We want to stop iterating (short-circuit) when we encounter an error.
We want to be able to return the error out of the iterator so it can be bubbled up.
We never want to have to ignore errors.

Someone who understands monads better would probably be able to state this more eloquently. But I'm not a language designer, so hopefully I can get away with waving my hands a bit here.

For example with the try_for_each function we're able to take a vec of names, and perform some IO operation on them. This will stop iterating over file names if an error occurs, and will immediately return the error:

use std::io::{stdout, Write};

let data = ["no_tea.txt", "stale_bread.json", "torrential_rain.png"];
data.iter().try_for_each(|x| {
    writeln!(stdout(), "{}", x)?;
    Ok(())
})?;

This works pretty smoothly. Now imagine we wanted to do something with IO operations. Say we had a folder for a database, and we wanted to compare the content of several files with each other, finding the smallest string by index ³.

Yes I know this example is completely unreasonable, but it's late here and I had to come up with something that did some IO. min_by is the actual method a friend brough up when building a database though, so at least we're hitting the right method here (:

Now we would likely want to use something like fs::read and Iterator::min_by, which we could do as such:

let offsets = [1, 4, 9, 18, 22, 17]

let res = db_files.iter().min_by(|x, y| {
    let fx = fs::read_file(format!(".db/{}.index"), x).unwrap();
    let fy = fs::read_file(format!(".db/{}.index"), y).unwrap();
    fx.cmp(fy)
});

Now as you might have noticed, we need to use unwrap here. Oh no! What this practically means is that if any error occurs when reading files our program will panic. Which means that Iterator::min_by probably shouldn't be used here, and we should turn this into manual loop logic instead.

Surveying Fallible Adapters

In order to figure out which methods are incompatible with fallible operations we should take a look at all methods on Iterator. The only methods that have the potential to be incompatible are the ones that take closures.

We've created an overview of all iterator methods that take closures. Here we look at the iterator adapter name, the signature of the closure that's passed, and whether it can short-circuit on error. Finally we also look at whether it has a fallible counterpart that can short-circuit.

In the following overview there are probably ways to make certain methods work with errors, or perhaps they work if Result has been passed into the method, but does not originate from the method. The measure we'll be using is whether the ? operator can be used inside the closure.

Name	Closure Signature	Accepts try operator?	Fallible counterpart
`all`	`FnMut(Self::Item) -> bool`	No	-
`any`	`FnMut(Self::Item) -> bool`	No	-
`cmp_by`	`FnMut(Self::Item, <I as IntoIterator>::Item) -> Ordering`	No	-
`eq_by`	`FnMut(Self::Item, <I as IntoIterator>::Item) -> Ordering`	No	-
`filter`	`FnMut(&Self::Item) -> bool`	No	-
`filter_map`	`FnMut(Self::Item) -> Option<B>`	Yes	-
`find`	`FnMut(&Self::Item) -> bool`	No	`try_find`
`find_map`	`FnMut(Self::Item) -> Option<B>`	Yes	-
`flat_map`	`FnMut(Self::Item) -> impl IntoIterator`	Yes	-
`fold`	`FnMut(B, Self::Item) -> B`	No	`try_fold`
`for_each`	`FnMut(Self::Item)`	No	`try_for_each`
`inspect`	`FnMut(&Self::Item)`	Yes	-
`is_partitioned`	`FnMut(Self::Item) -> bool`	No	-
`is_sorted_by`	`FnMut(&Self::Item, &Self::Item) -> Option<Ordering>`	No	-
`is_sorted_by_key`	`FnMut(&Self::Item) -> impl PartialOrd`	No	-
`map`	`FnMut(Self::Item) -> B`	Yes	-
`max_by`	`FnMut(&Self::Item, &Self::Item) -> Ordering`	No	-
`max_by_key`	`FnMut(&Self::Item) -> impl Ord`	No	-
`min_by`	`FnMut(&Self::Item, &Self::Item) -> Ordering`	No	-
`min_by_key`	`FnMut(&Self::Item) -> impl Ord`	No	-
`partial_cmp_by`	`FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering>`	No	-
`partition`	`FnMut(&Self::Item) -> bool`	No	-
`partition_in_place`	`FnMut(&T) -> bool`	No	-
`position`	`FnMut(Self::Item) -> bool`	No	-
`rposition`	`FnMut(Self::Item) -> bool`	No	-
`scan`	`FnMut(&mut St, Self::Item) -> Option<B>`	No	-
`skip_while`	`FnMut(&Self::Item) -> bool`	No	-
`take_while`	`FnMut(&Self::Item) -> bool`	No	-
`try_find`	`FnMut(&Self::Item) -> impl Try<Ok = bool, Error = E>`	Yes	-
`try_fold`	`FnMut(B, Self::Item) -> impl Try<Ok = B>`	Yes	-
`try_for_each`	`FnMut(Self::Item) -> impl Try<Ok = ()>`	Yes	-

Out of the 63 current stream adapters available, 30 take closures. Of those 30 adapters only 10 can bubble up a Result or have a counterpart that can do so. That means that 20 methods, or almost a third of the functionality in Iterator is currently incompatible with fallible operations.

Implementing missing adapters

If we wanted to expose the full range of functionality to fallible iterators, we would likely need to create try_ counterparts to the stream adapters that currently don't support it. This is a list of methods and their proposed fallible counterparts:

Name	Fallible counterpart
`all`	`try_all`
`any`	`try_any`
`cmp_by`	`try_cmp_any`
`eq_by`	`try_eq_by`
`filter`	`try_filter`
`inspect`	`try_inspect`
`is_partitioned`	`try_is_partitioned`
`is_sorted_by`	`try_is_sorted_by`
`is_sorted_by_key`	`try_is_sorted_by_key`
`max_by`	`try_max_by`
`max_by_key`	`try_max_by_key`
`min_by`	`try_min_by`
`min_by_key`	`try_min_by_key`
`partial_cmp_by`	`try_partial_cmp_by`
`partition`	`try_partition`
`partition_in_place`	`try_partition_in_place`
`position`	`try_position`
`rposition`	`try_rposition`
`scan`	`try_scan`
`skip_while`	`try_skip_while`
`take_while`	`try_take_while`

Conclusion

In this post we've looked at the various adapters std::iter::Iterator provides, talked about error handling, and surveyed how existing methods interact with them. Finally we've provided an overview of methods that currently don't provide support for the try operator in the closure body, and which counterparts would need to be added so they could work.

Some readers may be wondering now whether it's actually worth adding several new methods to Iterator, and whether that's perhaps too much. But I don't think counting methods is a particularly effective way of making API decisions ⁴. Instead it's better to look at overall cohesion and usability.

I think a more useful framework for answering this design question is to ask: "Are we okay that a third of the methods on iterator only work some of the time?" I think the answer there should be a clear "no". If a person wants to perform IO operations while filtering some input, they should be able to, without needing to start over and write manual loops from scratch.

⁴

I've definitely done this a lot in the past, and I was wrong for doing so. Growth.

Fallible Iterator Adapters— 2020-04-12

Surveying Fallible Adapters

Implementing missing adapters

Conclusion

Fallible Iterator Adapters
— 2020-04-12