17

I know we can use collect to turn an iterator of Results into a single outer Result, like:

fn produce_result(my_struct: &MyStruct) -> Result<MyStruct, Error>;

let my_results: Vec<MyStruct> = vec![];
let res = my_results.iter().map(|my_struct| produce_result(&my_struct)).collect::<Result<Vec<MyStruct>, Error>>();

which propagates the first error from the closure to the outer Result.
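For reference, here is a minimal runnable sketch of that behaviour. The bodies of `MyStruct`, `Error` and `produce_result` are made up for illustration (the question only gives the signature), and `collect_all` is a hypothetical helper name:

```rust
#[derive(Debug, PartialEq)]
struct MyStruct(u32);

#[derive(Debug, PartialEq)]
struct Error(String);

// Hypothetical stand-in for `produce_result`: fails on odd values.
fn produce_result(my_struct: &MyStruct) -> Result<MyStruct, Error> {
    if my_struct.0 % 2 == 0 {
        Ok(MyStruct(my_struct.0 * 10))
    } else {
        Err(Error(format!("odd value {}", my_struct.0)))
    }
}

fn collect_all(items: &[MyStruct]) -> Result<Vec<MyStruct>, Error> {
    // `Result<Vec<T>, E>` implements `FromIterator<Result<T, E>>`,
    // so `collect` stops at the first `Err` and returns it.
    items.iter().map(produce_result).collect()
}

fn main() {
    let all_ok = collect_all(&[MyStruct(2), MyStruct(4)]);
    assert_eq!(all_ok, Ok(vec![MyStruct(20), MyStruct(40)]));

    // Only the first error (for 3) is reported; 5 is never visited.
    let with_err = collect_all(&[MyStruct(2), MyStruct(3), MyStruct(5)]);
    assert_eq!(with_err, Err(Error("odd value 3".to_string())));
}
```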

However, this method doesn't work in the flat_map case (Rust playground):

fn produce_result(my_struct: &MyStruct) -> Result<Vec<MyStruct>, Error>;

let my_results: Vec<MyStruct> = vec![];
let res = my_results.iter().flat_map(|my_struct| produce_result(&my_struct)).collect::<Result<Vec<MyStruct>, Error>>();

The compiler complains: "a collection of type std::result::Result<std::vec::Vec<MyStruct>, Error> cannot be built from an iterator over elements of type std::vec::Vec<MyStruct>"

How can I work around this?

Evian
    Does this answer your question? [How do I perform iterator computations over iterators of Results without collecting to a temporary vector?](https://stackoverflow.com/questions/48841367/how-do-i-perform-iterator-computations-over-iterators-of-results-without-collect) [How to do simple math with a list of numbers from a file and print out the result in Rust?](https://stackoverflow.com/questions/59243725/how-to-do-simple-math-with-a-list-of-numbers-from-a-file-and-print-out-the-resul) – Stargateur Jan 22 '20 at 07:57
    The duplicate [results](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=d3f7eea56fbbc4d8b6f1093e8266f096) – Stargateur Jan 22 '20 at 08:06

3 Answers

13

flat_map "flattens" the top layer of the value returned from the closure by calling its IntoIterator implementation. Importantly, it doesn't try to reach inside that value. If you had your own MyResult type, for example, the code would fail at flat_map itself:

enum Error {}

enum MyResult<T, U> {
    Ok(T),
    Err(U),
}

struct MyStruct;

fn produce_result(item: &MyStruct) -> MyResult<Vec<MyStruct>, Error> {
    MyResult::Ok(vec![])
}

fn main() {
    let my_structs: Vec<MyStruct> = vec![];
    let res = my_structs
        .iter()
        .flat_map(|my_struct| produce_result(&my_struct))
        .collect::<Result<Vec<MyStruct>, Error>>();
}

(Playground)

Error:

error[E0277]: `MyResult<std::vec::Vec<MyStruct>, Error>` is not an iterator
  --> src/main.rs:18:10
   |
18 |         .flat_map(|my_struct| produce_result(&my_struct))
   |          ^^^^^^^^ `MyResult<std::vec::Vec<MyStruct>, Error>` is not an iterator
   |
   = help: the trait `std::iter::Iterator` is not implemented for `MyResult<std::vec::Vec<MyStruct>, Error>`
   = note: required because of the requirements on the impl of `std::iter::IntoIterator` for `MyResult<std::vec::Vec<MyStruct>, Error>`

In your case, however, the behaviour is different, since Result implements IntoIterator. Its iterator yields the Ok value unchanged and yields nothing for Err, so when flat_mapping over a Result you effectively ignore every error and only use the results of the successful calls.
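A small sketch may make this concrete. The `produce_result` below is a hypothetical stand-in working on plain integers, and `flat_map_results` is a name invented for this example. Note that what comes out of flat_map is a sequence of whole Vecs, which is also why the compiler says it cannot build a Result from elements of type Vec:

```rust
// Hypothetical stand-in: fails for the input 2, otherwise returns `n` copies of `n`.
fn produce_result(n: u32) -> Result<Vec<u32>, String> {
    if n == 2 {
        Err(format!("failed on {}", n))
    } else {
        Ok(vec![n; n as usize])
    }
}

fn flat_map_results(inputs: &[u32]) -> Vec<Vec<u32>> {
    // `Result<T, E>` iterates over zero or one `T`, so the `Err` for 2
    // silently disappears and each surviving item is a whole `Vec<u32>`.
    inputs.iter().flat_map(|&n| produce_result(n)).collect()
}

fn main() {
    let flattened = flat_map_results(&[1, 2, 3]);
    // The error for input 2 is gone, and the inner Vecs are not flattened.
    assert_eq!(flattened, vec![vec![1], vec![3, 3, 3]]);
}
```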

There is a way to fix it, although it's a bit cumbersome. You should explicitly match on the Result, wrapping the Err case in a Vec and "distributing" the Ok case over the already-existing Vec, then let flat_map do its job:

let res = my_structs
    .iter()
    .map(|my_struct| produce_result(&my_struct))
    .flat_map(|result| match result {
        Ok(vec) => vec.into_iter().map(|item| Ok(item)).collect(),
        Err(er) => vec![Err(er)],
    })
    .collect::<Result<Vec<MyStruct>, Error>>();

Playground

There's also another way, which might be more performant if errors are indeed present (even if only sometimes):

fn external_collect(my_structs: Vec<MyStruct>) -> Result<Vec<MyStruct>, Error> {
    Ok(my_structs
        .iter()
        .map(|my_struct| produce_result(&my_struct))
        .collect::<Result<Vec<_>, _>>()?
        .into_iter()
        .flatten()
        .collect())
}

Playground
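To check that both variants agree, here is a self-contained sketch. The `produce_result` body and its `err_value` parameter are assumptions made for the example (the item whose value equals `err_value` fails, every other item `n` expands to `0..n`), not part of the question:

```rust
#[derive(Debug, PartialEq)]
struct MyStruct(u64);

#[derive(Debug, PartialEq)]
struct Error;

// Hypothetical `produce_result`: errors on `err_value`, otherwise expands to 0..n.
fn produce_result(item: &MyStruct, err_value: u64) -> Result<Vec<MyStruct>, Error> {
    if item.0 == err_value {
        Err(Error)
    } else {
        Ok((0..item.0).map(MyStruct).collect())
    }
}

// Nested collect: turn every Result<Vec<_>, _> into a Vec<Result<_, _>> first.
fn internal_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    my_structs
        .iter()
        .map(|s| produce_result(s, err_value))
        .flat_map(|result| match result {
            Ok(vec) => vec.into_iter().map(Ok).collect::<Vec<_>>(),
            Err(e) => vec![Err(e)],
        })
        .collect()
}

// Chained collect: short-circuit into Vec<Vec<_>> first, then flatten.
fn external_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    let nested: Vec<Vec<MyStruct>> = my_structs
        .iter()
        .map(|s| produce_result(s, err_value))
        .collect::<Result<_, _>>()?;
    Ok(nested.into_iter().flatten().collect())
}

fn main() {
    let input = [MyStruct(1), MyStruct(2)];

    // No item equals 99, so both variants succeed with the flattened items.
    let expected = vec![MyStruct(0), MyStruct(0), MyStruct(1)];
    assert_eq!(internal_collect(&input, 99), Ok(expected));
    assert_eq!(external_collect(&input, 99), internal_collect(&input, 99));

    // The item with value 2 fails, so both variants return the error.
    assert_eq!(internal_collect(&input, 2), Err(Error));
    assert_eq!(external_collect(&input, 2), Err(Error));
}
```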

I did some quick benchmarking. The code is on the playground too, although it can't be run there because the cargo bench command isn't available, so I ran the benchmarks locally. Here are the results:

test vec_result::external_collect_end_error   ... bench:   2,759,002 ns/iter (+/- 1,035,039)
test vec_result::internal_collect_end_error   ... bench:   3,502,342 ns/iter (+/- 438,603)

test vec_result::external_collect_start_error ... bench:          21 ns/iter (+/- 6)
test vec_result::internal_collect_start_error ... bench:          30 ns/iter (+/- 19)

test vec_result::external_collect_no_error    ... bench:   7,799,498 ns/iter (+/- 815,785)
test vec_result::internal_collect_no_error    ... bench:   3,489,530 ns/iter (+/- 170,124)

It seems that the version with two chained collects (external_collect) takes about twice as long as the version with nested collects (internal_collect) when every call succeeds, but is substantially faster (by roughly 20 to 30 percent in these runs) when execution short-circuits on an error. This result is consistent over multiple benchmark runs, so the large variance reported probably doesn't matter much.

Cerberus
  • This way works! Thank you. But I wonder whether I could also get the expected result with `my_structs.iter().map(...).collect::<Result<Vec<_>, _>>()?.iter().flatten().collect`. Since both ways need more than one `collect`, which one is more efficient? – Evian Jan 22 '20 at 04:44
  • Got it. Thank you! – Evian Jan 22 '20 at 05:29
  • "test vec_result::external_collect_start_error ... bench: 21 ns/iter (+/- 6) test vec_result::internal_collect_start_error ... bench: 30 ns/iter (+/- 19)" This is unlikely to be a real result; the code was obviously not run but optimized away by the compiler. – Stargateur Jan 22 '20 at 08:07
  • @Stargateur I tried moving the error into the second entry instead of the first and got 61 ns and 62 ns respectively. Looks like the compiler does the short-circuiting at compile time? – Cerberus Jan 22 '20 at 09:25
  • Thanks for this thorough answer! I think `.map(|item| Ok(item))` can be abbreviated to `.map(Ok)`. – n8henrie May 12 '21 at 22:10
5

Using itertools' flatten_ok method:

use itertools::Itertools;
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    // without error
    let res = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
        .into_iter()
        .map(Ok::<_, &str>)
        .flatten_ok()
        .collect::<Result<Vec<_>, _>>()?;
    println!("{:?}", res);

    // with error
    let res = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
        .into_iter()
        .map(|_| Err::<[i32; 2], _>("errored out"))
        .flatten_ok()
        .collect::<Result<Vec<_>, _>>()?;
    println!("{:?}", res); // not printed

    Ok(())
}

playground link

Niki
0

In the benchmark below, the solution with the best performance when no error occurs is the one using Either. It is similar to Cerberus's Vec solution, except that the Vec is replaced by Left and Right, as mentioned in another question, which avoids allocating an intermediate Vec.


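use either::Either::{Left, Right};

fn to_either(
    result: Result<Vec<MyStruct>, Error>,
) -> impl Iterator<Item = Result<MyStruct, Error>> {
    // Both arms produce the same Either type: Left wraps the Ok items,
    // Right yields the single error.
    match result {
        Ok(vec) => Left(vec.into_iter().map(Ok)),
        Err(e) => Right(std::iter::once(Err(e))),
    }
}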


fn either_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    my_structs
        .iter()
        .map(|my_struct| produce_result(my_struct, err_value))
        .flat_map(to_either)
        .collect::<Result<Vec<MyStruct>, Error>>()
}
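
To illustrate why a two-variant wrapper is enough here, below is a std-only sketch. `OneOf`, `to_one_of` and `flat_collect` are hypothetical names invented for this example; `OneOf` is a simplified stand-in for the idea behind `either::Either`, not the crate's actual implementation:

```rust
#[derive(Debug, PartialEq)]
struct MyStruct(u64);

#[derive(Debug, PartialEq)]
struct Error;

// Hypothetical two-variant wrapper: an iterator that is one of two iterator types.
enum OneOf<L, R> {
    Left(L),
    Right(R),
}

impl<L, R> Iterator for OneOf<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    // Delegate to whichever variant is present; no heap allocation is involved.
    fn next(&mut self) -> Option<Self::Item> {
        match self {
            OneOf::Left(l) => l.next(),
            OneOf::Right(r) => r.next(),
        }
    }
}

fn to_one_of(
    result: Result<Vec<MyStruct>, Error>,
) -> impl Iterator<Item = Result<MyStruct, Error>> {
    match result {
        // Ok case: reuse the existing Vec's iterator and wrap each item in Ok.
        Ok(vec) => OneOf::Left(vec.into_iter().map(Ok)),
        // Err case: a one-element iterator carrying the error.
        Err(e) => OneOf::Right(std::iter::once(Err(e))),
    }
}

fn flat_collect(results: Vec<Result<Vec<MyStruct>, Error>>) -> Result<Vec<MyStruct>, Error> {
    results.into_iter().flat_map(to_one_of).collect()
}

fn main() {
    let ok = flat_collect(vec![Ok(vec![MyStruct(1)]), Ok(vec![MyStruct(2), MyStruct(3)])]);
    assert_eq!(ok, Ok(vec![MyStruct(1), MyStruct(2), MyStruct(3)]));

    let failed = flat_collect(vec![Ok(vec![MyStruct(1)]), Err(Error), Ok(vec![MyStruct(2)])]);
    assert_eq!(failed, Err(Error));
}
```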

A Box can also be used, as mentioned in https://stackoverflow.com/a/29760740/955091.

Below is an updated benchmark that includes all the solutions:

#![feature(test)]
extern crate test;
use test::{black_box, Bencher};
use either::*;

struct Error;

struct MyStruct(u64);

fn produce_result(item: &MyStruct, err_value: u64) -> Result<Vec<MyStruct>, Error> {
    if item.0 == err_value {
        Err(Error)
    } else {
        Ok((0..item.0).map(MyStruct).collect())
    }
}


fn to_box_iterator(
    result: Result<Vec<MyStruct>, Error>,
) -> Box<dyn Iterator<Item = Result<MyStruct, Error>>> {
    match result {
        Ok(vec) => Box::new(vec.into_iter().map(Ok)),
        Err(e) => Box::new(std::iter::once(Err(e))),
    }
}


fn to_either(
    result: Result<Vec<MyStruct>, Error>,
) -> impl Iterator<Item = Result<MyStruct, Error>> {
    match result {
        Ok(vec) => Left(vec.into_iter().map(Ok)),
        Err(e) => Right(std::iter::once(Err(e))),
    }
}


fn either_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    my_structs
            .iter()
            .map(|my_struct| produce_result(&my_struct, err_value))
            .flat_map(to_either)
            .collect::<Result<Vec<MyStruct>, Error>>()
}

fn box_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    my_structs
            .iter()
            .map(|my_struct| produce_result(&my_struct, err_value))
            .flat_map(to_box_iterator)
            .collect::<Result<Vec<MyStruct>, Error>>()
}

fn internal_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    my_structs
            .iter()
            .map(|my_struct| produce_result(&my_struct, err_value))
            .flat_map(|result| match result {
                Ok(vec) => vec.into_iter().map(|item| Ok(item)).collect(),
                Err(er) => vec![Err(er)],
            })
            .collect::<Result<Vec<MyStruct>, Error>>()
}

fn external_collect(my_structs: &[MyStruct], err_value: u64) -> Result<Vec<MyStruct>, Error> {
    Ok(my_structs
            .iter()
            .map(|my_struct| produce_result(&my_struct, err_value))
            .collect::<Result<Vec<_>, _>>()?
            .into_iter()
            .flatten()
            .collect())
}

#[bench]
pub fn internal_collect_start_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| internal_collect(&my_structs, 0));
}

#[bench]
pub fn box_collect_start_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| box_collect(&my_structs, 0));
}

#[bench]
pub fn either_collect_start_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| either_collect(&my_structs, 0));
}

#[bench]
pub fn external_collect_start_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| external_collect(&my_structs, 0));
}

#[bench]
pub fn internal_collect_end_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| internal_collect(&my_structs, 999));
}

#[bench]
pub fn box_collect_end_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| box_collect(&my_structs, 999));
}

#[bench]
pub fn either_collect_end_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| either_collect(&my_structs, 999));
}

#[bench]
pub fn external_collect_end_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| external_collect(&my_structs, 999));
}

#[bench]
pub fn internal_collect_no_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| internal_collect(&my_structs, 1000));
}

#[bench]
pub fn box_collect_no_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| box_collect(&my_structs, 1000));
}

#[bench]
pub fn either_collect_no_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| either_collect(&my_structs, 1000));
}

#[bench]
pub fn external_collect_no_error(b: &mut Bencher) {
    let my_structs: Vec<_> = black_box((0..1000).map(MyStruct).collect());
    b.iter(|| external_collect(&my_structs, 1000));
}

Result:

cargo bench
  Downloaded either v1.8.1
  Downloaded 1 crate (16.0 KB) in 0.24s
   Compiling either v1.8.1
   Compiling my-project v0.1.0 (/home/runner/DarksalmonLiquidPi)
    Finished bench [optimized] target(s) in 4.63s
     Running unittests src/main.rs (target/release/deps/my_project-03c6287d7b53ee50)

running 12 tests
test box_collect_end_error        ... bench:   9,439,159 ns/iter (+/- 7,009,221)
test box_collect_no_error         ... bench:   9,538,551 ns/iter (+/- 7,923,652)
test box_collect_start_error      ... bench:          81 ns/iter (+/- 178)
test either_collect_end_error     ... bench:   4,266,292 ns/iter (+/- 6,008,125)
test either_collect_no_error      ... bench:   3,341,910 ns/iter (+/- 6,344,290)
test either_collect_start_error   ... bench:          41 ns/iter (+/- 53)
test external_collect_end_error   ... bench:     209,960 ns/iter (+/- 663,883)
test external_collect_no_error    ... bench:  10,074,473 ns/iter (+/- 4,737,417)
test external_collect_start_error ... bench:          17 ns/iter (+/- 55)
test internal_collect_end_error   ... bench:   8,860,670 ns/iter (+/- 6,148,916)
test internal_collect_no_error    ... bench:   8,564,756 ns/iter (+/- 6,842,558)
test internal_collect_start_error ... bench:          44 ns/iter (+/- 165)

test result: ok. 0 passed; 0 failed; 0 ignored; 12 measured; 0 filtered out; finished in 69.52s

The benchmark can be found at https://replit.com/@Atry/DarksalmonLiquidPi

Yang Bo