1

I am running benchmarks with Criterion but am facing issues with functions that have an input that does not implement the Copy trait.

For example, I have set up the following benchmark for a function with the signature pub fn hash(vector: Vec<&str>) -> u64.

pub fn criterion_benchmark(c: &mut Criterion) {
    let s: String = String::from("Hello World!");
    let tokens: Vec<&str> = hashing::tokenize(&s);
    c.bench_function(
        "hash",
        |b| b.iter(|| {
            hashing::hash(tokens)
        }),
    );
}

However, unlike with types that have the Copy trait, the compiler throws out the following ownership error.

error[E0507]: cannot move out of `tokens`, a captured variable in an `FnMut` closure
  --> benches/benchmark.rs:17:34
   |
13 |     let tokens: Vec<&str> = hashing::tokenize(&s);
   |         ------ captured outer variable
...
17 |             hashing::hash(tokens)
   |                                  ^^^^^^ move occurs because `tokens` has type `Vec<&str>`, which does not implement the `Copy` trait

error[E0507]: cannot move out of `tokens`, a captured variable in an `FnMut` closure
  --> benches/benchmark.rs:16:20
   |
13 |     let tokens: Vec<&str> = hashing::tokenize(&s);
   |         ------ captured outer variable
...
16 |         |b| b.iter(|| {
   |                    ^^ move out of `tokens` occurs here
17 |             hashing::hash(tokens)
   |                                  ------
   |                                  |
   |                                  move occurs because `tokens` has type `Vec<&str>`, which does not implement the `Copy` trait
   |                                  move occurs due to use in closure

How can non-copyable inputs be passed to the benchmarked function without running into ownership issues?

bergwald
  • 88
  • 1
  • 7
  • 2
    The real problem seems to be in the benchmarked hashing function. Can't you just fix it to accept `&[&str]` (and call it with `hashing::hash(&tokens)`) ? Why would hashing need a vec ? – Denys Séguret Oct 12 '21 at 12:05
  • 2
    why not clone it ? – Stargateur Oct 12 '21 at 12:07
  • @Stargateur Cloning the parameter solved the issue. Thanks! – bergwald Oct 12 '21 at 12:14
  • @Stargateur Could the overhead of cloning the vector affect the benchmark? – bergwald Oct 12 '21 at 12:23
  • @bergwald Yes, it will. Criterion has tools to compute it separately, though. But your current prototype means every use of the hashing function will pay this price too. Hence my suggestion above. – Denys Séguret Oct 12 '21 at 12:24
  • @bergwald yes, of course, since you use criterion you could consider using https://docs.rs/criterion/0.3.5/criterion/struct.Bencher.html#method.iter_batched that should reduce the impact of cloning overhead – Stargateur Oct 12 '21 at 12:27
  • 1
    @Stargateur @DenysSéguret Results: Cloning the function has a ~50% overhead compared to accepting `&[&str]`. – bergwald Oct 12 '21 at 13:53

2 Answers2

1

Cloning the input can result in severe errors in the benchmark result.

Therefore you should use iter_batched() instead of iter()

use criterion::{black_box, criterion_group, criterion_main, BatchSize, BenchmarkId, Criterion};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

criterion_main!(benches);
criterion_group!(benches, criterion_benchmark);

fn criterion_benchmark(c: &mut Criterion) {
    let input_data: String = String::from("Hello World!");

    c.bench_function("bench_function", |bencher| {
        bencher.iter_batched(
            || init_data(&input_data),
            |input| {
                let x = benchmark_me(input);
                black_box(x);
            },
            BatchSize::SmallInput,
        );
    });

    c.bench_function("bench_function+clone", |bencher| {
        bencher.iter(|| {
            let x = benchmark_me(init_data(&input_data));
            black_box(x);
        });
    });
}

fn init_data(s: &str) -> Vec<&str> {
    // it's intentionally slower than a plain copy, to make the difference more visible!
    s.split_ascii_whitespace().collect()
}

fn benchmark_me(s: Vec<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

Results:

bench_function                  time:   [99.520 ns 100.90 ns 102.23 ns]                           
bench_function+clone            time:   [210.41 ns 212.08 ns 213.77 ns]                                                                                                     ```
Svetlin Zarev
  • 14,713
  • 4
  • 53
  • 82
0

As suggested by @Stargateur, cloning the parameter solved the ownership issue.

pub fn criterion_benchmark(c: &mut Criterion) {
    let s: String = String::from("Hello World!");
    let tokens: Vec<&str> = hashing::tokenize(&s);
    c.bench_function(
        "hash",
        |b| b.iter(|| {
            hashing::hash(tokens.clone())
        }),
    );
}

However, as proposed by @DenysSéguret and @Masklinn, changing the hash function to accept &[&str] avoids the ~50% overhead of cloning the vector.

bergwald
  • 88
  • 1
  • 7
  • Alternatively if you can update `hash` to take a slice, you could `move` the tokens in the closure and simply lend them to the hash function. – Masklinn Oct 12 '21 at 12:40