Given that both methods take the vector by value, the first version (with into_iter()) is better because it does not copy the strings.
When a string matches the condition, it is moved into the destination vector. With the second version (with iter()), a string that matches the predicate is copied and the copy is moved into the destination vector. This results in more allocations and more deallocations.
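To make the move-versus-copy difference visible, here is a minimal sketch that compares the heap buffer addresses of the strings before and after filtering. The helper names (keep_longest_moved, keep_longest_cloned, buffer_addresses) are made up for this illustration, cloned() stands in for the map(|s| s.to_string()) of the second version, and unwrap_or(0) is used instead of unwrap() so that an empty input does not panic.

```rust
/// Moves the matching strings out of the input vector.
/// Only the (pointer, length, capacity) triple of each String is moved;
/// the heap buffer holding the characters stays where it is.
fn keep_longest_moved(input: Vec<String>) -> Vec<String> {
    let max_len = input.iter().map(|s| s.len()).max().unwrap_or(0);
    input.into_iter().filter(|s| s.len() == max_len).collect()
}

/// Clones the matching strings; every clone allocates a new heap buffer.
fn keep_longest_cloned(input: &[String]) -> Vec<String> {
    let max_len = input.iter().map(|s| s.len()).max().unwrap_or(0);
    input.iter().filter(|s| s.len() == max_len).cloned().collect()
}

/// Returns the address of each string's heap buffer (illustration only).
fn buffer_addresses(strings: &[String]) -> Vec<usize> {
    strings.iter().map(|s| s.as_ptr() as usize).collect()
}
```

If you record buffer_addresses(&input) first and then call both functions, the strings returned by keep_longest_moved should report the same addresses as the originals they came from, while the strings returned by keep_longest_cloned report different ones, since each of them owns a freshly allocated buffer.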
Here is a simple Criterion benchmark that demonstrates this:
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion}; // criterion 0.3.5

criterion_group!(benches, benchmark);
criterion_main!(benches);

pub fn benchmark(c: &mut Criterion) {
    // Worst case for with_iter: every string has the maximum length,
    // so every string matches the predicate and gets copied.
    let input = vec![
        "some string".to_owned(),
        "some string".to_owned(),
        "some string".to_owned(),
        "some string".to_owned(),
        "some string".to_owned(),
        "some string".to_owned(),
        "some string".to_owned(),
    ];

    // Both functions take the vector by value, so the input is cloned
    // on every iteration in both benchmarks.
    c.bench_with_input(BenchmarkId::new("with_iter", ""), &input, |b, i| {
        b.iter(|| {
            let result = with_iter(i.clone());
            black_box(result);
        });
    });

    c.bench_with_input(BenchmarkId::new("with_into_iter", ""), &input, |b, i| {
        b.iter(|| {
            let result = with_into_iter(i.clone());
            black_box(result);
        });
    });
}

// Moves the matching strings into the result vector.
fn with_into_iter(input_array: Vec<String>) -> Vec<String> {
    let max_len = input_array.iter().map(|string| string.len()).max().unwrap();
    input_array
        .into_iter()
        .filter(|string| string.len() == max_len)
        .collect()
}

// Copies the matching strings into the result vector.
fn with_iter(input_array: Vec<String>) -> Vec<String> {
    let max_len = input_array.iter().map(|string| string.len()).max().unwrap();
    input_array
        .iter()
        .filter(|string| string.len() == max_len)
        .map(|s| s.to_string())
        .collect()
}
This is the extreme case, in which every value is copied. If your input contains only a few short strings to copy, the difference might be negligible.
with_iter/ time: [392.30 ns 394.80 ns 397.93 ns]
with_into_iter/ time: [153.07 ns 153.41 ns 153.79 ns]
The timings depend on the number of strings that are copied and on their lengths, so the actual numbers will vary with the function's input.
PS: If you want to be as fast as possible, you can use retain():
// Filters the vector in place instead of building a second one.
fn with_retain(mut input_array: Vec<String>) -> Vec<String> {
    let max_len = input_array.iter().map(|string| string.len()).max().unwrap();
    input_array.retain(|string| string.len() == max_len);
    input_array
}
with_retain/ time: [143.98 ns 144.22 ns 144.50 ns]
It is faster because:
- it does not allocate a second "result" vector
- if there are a lot of matches, the vector never needs to be resized (because the output length will always be <= the input length), thus avoiding additional allocations, deallocations and copying
The downside is that if there are very few matches in a long input array, the returned vector keeps the capacity of the input, so it is larger than it needs to be and wastes a bit of memory.
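If the leftover capacity matters, one possible way to trade a little time for memory is to call shrink_to_fit() after retain(). The sketch below is a hypothetical variant of with_retain (the name retain_longest and the shrink flag are invented for this example), and it uses unwrap_or(0) rather than unwrap() so that an empty input does not panic. It has not been benchmarked here, and shrink_to_fit() may reallocate, so it could cancel part of the speed advantage.

```rust
/// Keeps only the longest strings, filtering the vector in place.
/// When `shrink` is true, the unused capacity is released afterwards,
/// which may cost an extra reallocation.
fn retain_longest(mut input: Vec<String>, shrink: bool) -> Vec<String> {
    let max_len = input.iter().map(|s| s.len()).max().unwrap_or(0);
    input.retain(|s| s.len() == max_len);
    if shrink {
        // Asks the vector to drop capacity down towards its length.
        input.shrink_to_fit();
    }
    input
}
```

Without the flag, the returned vector reports the same capacity() as the input did, because a Vec never shrinks on its own. With the flag, the capacity is reduced to something close to the number of retained strings.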