
I need to generate huge random vectors of f32s multiple times in my program, so I am looking for ways to make it parallel and efficient. I have made some progress with using Rayon's into_par_iter but I haven't found a way around having to initialize a new rng variable during the mapping.

Here is what I have currently:

    let r_dist = Uniform::new(0., 10.);

    let rand_vec: Vec<f32> = (1..biiiig_u64)
        .into_par_iter()
        .map(|_| {
            let mut rng = rand::thread_rng();
            rng.sample(r_dist)
        })
        .collect();

Of course this is making full use of all cpu cores, but I feel like initializing the new mut rng inside the mapping function is inefficient (I am new so I might be wrong). Is it possible to initialize an rng outside the iterator and use it non-unsafe-ly? Thanks.

AmanKP
    You probably want thread-local initialization of rng, so that each thread has its own version of rng, and no synchronization is needed. Look around [this topic](https://stackoverflow.com/q/42647900/8564999) – Alexey S. Larionov May 18 '23 at 07:50
  • 3
    Rayon uses threadpool under the hood, AFAIK. And thread_rng() returns thread-local generator. This should be very efficient as it is. – freakish May 18 '23 at 07:50
  • 1
    Be aware that generating tons of random numbers usually leads to the starvation of the entropy pool of your OS, depending on the implementation. So `/dev/urandom` on Linux, for example, won't function properly. Make sure you know what you're doing as this can endanger the security of what you're doing if this is related to a security application. – The Quantum Physicist May 18 '23 at 11:07
  • I am simply working with large graphs, so I just need random node positions, but thank you for the information. I wasn't aware that rng decayed in quality with large sample sizes. Security isn't an issue for my current use case but I will keep this in mind. – AmanKP May 18 '23 at 11:16

1 Answer


thread_rng is designed specifically for efficient use from multiple threads. From the docs:

Retrieve the lazily-initialized thread-local random number generator, seeded by the system.

So it is created once per thread and stored in a thread-local variable. Calling it inside the closure only fetches a handle to the existing generator, so your current code should already be quite fast.
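To make "created once per thread" concrete, here is a minimal std-only sketch of the same pattern. Everything in it is hypothetical and for illustration: `ToyRng` is a toy xorshift generator standing in for `ThreadRng` (it is not the algorithm rand uses and is not suitable for anything serious), and `INIT_COUNT`, `sample` and `fill_on_threads` are names I made up. The point is only that the `thread_local!` initializer runs on first access in each thread, not on every call.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts how many generators have been constructed across all threads.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

// Toy xorshift64 generator: a stand-in for ThreadRng, NOT rand's algorithm.
struct ToyRng(u64);

impl ToyRng {
    fn new() -> Self {
        // Derive a distinct, non-zero seed from the construction order.
        let id = INIT_COUNT.fetch_add(1, Ordering::SeqCst) as u64;
        ToyRng(0x9E37_79B9_7F4A_7C15u64.wrapping_mul(id + 1))
    }

    fn next_f32(&mut self, low: f32, high: f32) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Map the top 24 bits to a value in [0, 1), then scale to the range.
        let unit = (self.0 >> 40) as f32 / (1u64 << 24) as f32;
        low + unit * (high - low)
    }
}

thread_local! {
    // Lazily initialized: ToyRng::new() runs on first access in each thread.
    static RNG: RefCell<ToyRng> = RefCell::new(ToyRng::new());
}

// Cheap per-call access: borrows the generator this thread already owns.
fn sample(low: f32, high: f32) -> f32 {
    RNG.with(|rng| rng.borrow_mut().next_f32(low, high))
}

// Spawns `threads` workers that each draw `per_thread` samples.
fn fill_on_threads(threads: usize, per_thread: usize) -> Vec<f32> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                (0..per_thread)
                    .map(|_| sample(0.0, 10.0))
                    .collect::<Vec<f32>>()
            })
        })
        .collect();

    handles
        .into_iter()
        .flat_map(|h| h.join().unwrap())
        .collect()
}
```

Calling `fill_on_threads(4, 1000)` draws 4000 samples, yet `INIT_COUNT` only increases by one per worker thread. That is the same reason `rand::thread_rng()` inside your `map` closure is cheap: the call fetches the existing per-thread generator rather than building a new one.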

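However, Rayon has a method for exactly your use case: map_init. It calls an init function as needed for each batch of items that Rayon processes together, and passes the resulting value to the mapping closure by mutable reference, so you don't have to fetch the rng for every element.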

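    // Imports as of rand 0.8 and rayon 1.x.
    use rand::{distributions::Uniform, Rng};
    use rayon::prelude::*;

    let r_dist = Uniform::new(0., 10.);

    let rand_vec: Vec<f32> = (1..biiiig_u64)
        .into_par_iter()
        // thread_rng is the init function; the closure reuses the rng it is handed.
        .map_init(rand::thread_rng, |rng, _| rng.sample(r_dist))
        .collect();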