Parallel initialisation of big memory block in Rust

Question

What is the best way to do this:

I have a 2GiB memory block that needs to be initialised with some data. Data are all independent, so I can easily spawn n threads to initialise the data in parallel with each thread working on a separate memory location.

How can I "tell" this to Rust? It won't allow me to share the memory between threads (for good reason and rightfully so). I know there will be no race condition because each thread works on totally separate memory locations.

One idea is to work with (crossbeam) channels: each thread sends its computation to a single writer that puts the data into the right place. Alas, this feels overly complicated and not efficient enough. Is there some way to partition the memory so that it is safe for different threads to work on it?

Without details it's hard to give you a precise solution. But you should have a look at this example: https://rust-lang-nursery.github.io/rust-cookbook/concurrency/parallel.html — Denys Séguret, Apr 20 '21 at 04:46
Does this answer your question? [Simultaneous mutable access to arbitrary indices of a large vector that are guaranteed to be disjoint](https://stackoverflow.com/questions/55939552/simultaneous-mutable-access-to-arbitrary-indices-of-a-large-vector-that-are-guar) — Denys Séguret, Apr 20 '21 at 04:47

score 2 · Answer 1 · answered Apr 20 '21 at 05:27

The go-to crate for parallelism is rayon, which can make this easy:

use rayon::prelude::*;

// this just takes a big zeroed buffer and fills it with 1s,
// split into 10 chunks that rayon's thread pool processes in parallel
fn main() {
    let mut data = vec![0u8; 2000000]; // pretend this is 2 GiB
    data.chunks_mut(200000) // pretend this is 2 GiB / N threads
        .par_bridge()
        .for_each(|d| d.fill(1));
}

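The chunks_mut() method on slices already yields non-overlapping mutable sub-slices of the original, so the borrow checker accepts handing each one to a different thread. par_bridge() then turns that sequential iterator into a parallel one. Rayon also offers par_chunks_mut() on slices, which produces the parallel iterator directly.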
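
As a rough std-only sketch of the same idea (assuming Rust 1.63 or newer for std::thread::scope; the helper name init_by_index and the "index % 251" fill pattern are made up for illustration), each chunk from chunks_mut() can be handed to its own scoped thread, and enumerate() recovers the chunk's global offset when the data depends on the position:

```rust
use std::thread;

/// Hypothetical helper: sets `data[i]` to `(i % 251) as u8`,
/// using one scoped thread per chunk of `chunk_size` elements.
fn init_by_index(data: &mut [u8], chunk_size: usize) {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    thread::scope(|s| {
        // chunks_mut yields non-overlapping mutable sub-slices;
        // the last one may be shorter than chunk_size.
        for (chunk_idx, chunk) in data.chunks_mut(chunk_size).enumerate() {
            s.spawn(move || {
                // Global index of the first element of this chunk.
                let base = chunk_idx * chunk_size;
                for (offset, byte) in chunk.iter_mut().enumerate() {
                    *byte = ((base + offset) % 251) as u8;
                }
            });
        }
    }); // all threads are joined before `scope` returns
}
```

For example, init_by_index(&mut buf, 300) on a 1000-byte buffer spawns four threads, the last of which handles the remaining 100 bytes. Because this spawns one thread per chunk, chunk_size should be chosen so that the number of chunks is close to the number of cores.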

user4815162342 · Answer 2 · 2021-04-20T14:01:54.010

In addition to the excellent Rayon crate recommended by kmdreko, a lower-level primitive that achieves what you want is slice::split_at_mut. You can use it to split an existing mutable slice into multiple non-overlapping mutable slices without using unsafe code:

let mut data = vec![0u8; 2_000_000]; // pretend this is 2 GiB
let mut data = &mut data[..];
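let chunk_size = data.len() / n_threads; // n_threads: the desired number of threads

for _ in 0..n_threads {
    // peel one chunk off the front and keep the rest for the next iteration
    let (chunk, rest) = data.split_at_mut(chunk_size);
    data = rest;
    // `spawn` stands for a scoped-thread spawn, see the full example below
    spawn(move |_| {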
        chunk.fill(1);
    });
}

However, for this to work, you need scoped threads, i.e. threads that are allowed to refer to borrowed values. (That is sound because the scope joins every thread it spawned before returning, which guarantees that no thread outlives a borrowed value.)

Here is an example, using a similar setup as in kmdreko's answer:

use crossbeam_utils::thread;

fn main() {
    let n_threads = 8;
    let mut data = vec![0u8; 2_000_000]; // pretend this is 2 GiB
    let mut data = &mut data[..];
    let chunk_size = data.len() / n_threads;
    thread::scope(|s| {
        for _ in 0..n_threads {
            let (chunk, rest) = data.split_at_mut(chunk_size);
            data = rest;
            s.spawn(move |_| {
                chunk.fill(1);
            });
        }
    })
    .unwrap();
}
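
Since Rust 1.63 the standard library provides scoped threads as well (std::thread::scope), so a similar sketch should work without crossbeam. The version below is an illustration rather than a drop-in replacement: fill_in_parallel is a hypothetical helper, and it lets the last thread take whatever is left, so a length that is not a multiple of n_threads is still fully initialised (the code above would leave the final data.len() % n_threads bytes untouched in that case).

```rust
use std::mem;
use std::thread;

/// Hypothetical helper: fills `data` with `value` using `n_threads` scoped threads.
fn fill_in_parallel(mut data: &mut [u8], n_threads: usize, value: u8) {
    assert!(n_threads > 0, "need at least one thread");
    let chunk_size = data.len() / n_threads;
    thread::scope(|s| {
        for i in 0..n_threads {
            // The last thread takes everything that is left, so the
            // remainder of the division is not skipped.
            let len = if i == n_threads - 1 { data.len() } else { chunk_size };
            // mem::take moves the slice reference out of `data` (leaving an
            // empty slice behind), so both halves keep the original lifetime.
            let (chunk, rest) = mem::take(&mut data).split_at_mut(len);
            data = rest;
            s.spawn(move || chunk.fill(value));
        }
    }); // all threads are joined before `scope` returns
}
```

Calling fill_in_parallel(&mut buf, 3, 7) on a 10-byte buffer gives the three threads 3, 3 and 4 bytes respectively. Unlike the crossbeam version, std::thread::scope returns the closure's value directly and the spawn closure takes no argument, so there is no .unwrap() on the scope.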

