What is a faster way to perform element-wise summation of different length vectors?

Question

I'm trying to find a faster way to sum hundreds of these structs, each with a different length:

pub struct StereoWaveform {
    pub l_buffer: Vec<f64>,
    pub r_buffer: Vec<f64>,
}

I'm currently doing it like this:

fn sum_all_waveforms(vec_wav: Vec<StereoWaveform>) -> StereoWaveform {
    let mut result = StereoWaveform::new(0);
    for wav in vec_wav {
        result.l_buffer = sum_vec(&result.l_buffer, wav.l_buffer);
        result.r_buffer = sum_vec(&result.r_buffer, wav.r_buffer);
    }

    result
}

use itertools::Itertools; // brings `zip_longest` into scope

fn sum_vec(a: &Vec<f64>, b: Vec<f64>) -> Vec<f64> {
    let vec_len = std::cmp::max(a.len(), b.len());
    let mut acc: Vec<f64> = vec![0.0; vec_len];
    for (i, e) in a.iter().zip_longest(&b).enumerate() {
        match e {
            itertools::EitherOrBoth::Both(v1, v2) => acc[i] = v1 + v2,
            itertools::EitherOrBoth::Left(e) => acc[i] = *e,
            itertools::EitherOrBoth::Right(e) => acc[i] = *e,
        }
    }

    acc
}

I'm already using Rayon in the project, so it would be nice to find a solution using that.

[Why is it discouraged to accept a reference to a String (&String), Vec (&Vec) or Box (&Box) as a function argument?](https://stackoverflow.com/q/40006219/155423) — Shepmaster, Dec 17 '18 at 21:38
Please review how to create a [MCVE] and then [edit] your question to include it. We cannot tell what crates, types, traits, fields, etc. are present in the code. For example: ``no function or associated item named `new` found for type``. We don't know what version of Itertools you are using. Try to produce something that reproduces your error on the [Rust Playground](https://play.rust-lang.org) or you can reproduce it in a brand new Cargo project. There are [Rust-specific MCVE tips](//stackoverflow.com/tags/rust/info) as well. — Shepmaster, Dec 17 '18 at 21:40
You are asking for performance improvements, but you haven't provided any code that measures the performance or shown what your testing methodology is. You state "hundreds of structs" but don't provide a way to construct these values, forcing any answerer to do *a lot* of work to even attempt to solve the problem. You haven't provided your environment information (CPU/memory, operating system, Rust version, etc.) or what restrictions / requirements the program has (will it always run on a particular CPU, etc.). — Shepmaster, Dec 17 '18 at 21:43

score 1 · Accepted Answer · answered Dec 18 '18 at 02:38

sum_vec would be much faster if you extracted the branch out of the loop and stopped making new vectors.

fn sum_vec(a: &mut Vec<f64>, b: &[f64]) {
    if a.len() < b.len() {
        a.resize(b.len(), 0.0);
    }

    for (ai, bi) in a.iter_mut().zip(b) {
        *ai += *bi;
    }
}

or somesuch
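
Because this `sum_vec` accumulates in place rather than returning a new vector, the caller has to change as well. A minimal sketch of what that could look like follows. The question does not show `StereoWaveform::new`, so the struct is built directly from empty vectors here, and taking `vec_wav` as a slice is an assumption made for illustration:

```rust
pub struct StereoWaveform {
    pub l_buffer: Vec<f64>,
    pub r_buffer: Vec<f64>,
}

// In-place sum from the answer: grow `a` if it is shorter, then add `b` into it.
fn sum_vec(a: &mut Vec<f64>, b: &[f64]) {
    if a.len() < b.len() {
        a.resize(b.len(), 0.0);
    }

    // `zip` stops at the shorter side, which is `b` after the resize,
    // so any tail of `a` beyond `b.len()` is left unchanged.
    for (ai, bi) in a.iter_mut().zip(b) {
        *ai += *bi;
    }
}

// Assumed call site: borrows the waveforms instead of consuming them.
fn sum_all_waveforms(vec_wav: &[StereoWaveform]) -> StereoWaveform {
    // Built directly because the original `new` constructor is not shown.
    let mut result = StereoWaveform {
        l_buffer: Vec::new(),
        r_buffer: Vec::new(),
    };
    for wav in vec_wav {
        sum_vec(&mut result.l_buffer, &wav.l_buffer);
        sum_vec(&mut result.r_buffer, &wav.r_buffer);
    }

    result
}
```

With this shape, the only allocations are the occasional `resize` of the two accumulator buffers; no temporary vector is created per waveform.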

Extracting the resize such that you only ever resize once to the maximum length would be even faster, and sorting vec_wav by length (longest first) should improve branch prediction and cache locality.
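
The "resize once" idea can be sketched roughly as below. The function name `sum_buffers` and the choice to operate on a slice of plain `Vec<f64>` buffers (one channel at a time) are assumptions for illustration, not part of the original code. The accumulator is allocated a single time at the maximum length, so the inner loop contains no length check or reallocation:

```rust
// Hypothetical helper: sums buffers of differing lengths element-wise.
fn sum_buffers(buffers: &[Vec<f64>]) -> Vec<f64> {
    // Find the longest buffer up front; an empty input yields an empty result.
    let max_len = buffers.iter().map(Vec::len).max().unwrap_or(0);

    // Allocate the accumulator once, already at its final size.
    let mut acc = vec![0.0; max_len];

    for buf in buffers {
        // `zip` stops at the end of `buf`, so shorter buffers only
        // touch the leading part of `acc`.
        for (a, b) in acc.iter_mut().zip(buf) {
            *a += *b;
        }
    }

    acc
}
```

For the `StereoWaveform` case, this would be applied once to the left buffers and once to the right buffers. Whether it beats the per-call `resize` version by a meaningful margin depends on the data and should be measured.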

I'll improve the question, but in the short-term, this did improve performance by 2x so I've accepted the answer. Thank you Veedrac. — Danny Meyer, Dec 18 '18 at 21:08
