4

the following example illustrates what i am trying to do:

use rayon::prelude::*;

struct Parent {
    children: Vec<Child>,
}
struct Child {
    value: f64,
    index: usize,
    //will keep the distances to other children in the children vactor of parent
    distances: Vec<f64>,
}

impl Parent {
    fn calculate_distances(&mut self) {
        self.children
            .par_iter_mut()
            .for_each(|x| x.calculate_distances(&self.children));
    }
}

impl Child {
    fn calculate_distances(&mut self, children: &[Child]) {
        children
            .iter()
            .enumerate()
            .for_each(|(i, x)| self.distances[i] = (self.value - x.value).abs());
    }
}

The above won't compile. The problem is, that i can't access &self.children in the closure of the first for_each. I do understand, why the borrow checker does not allow that, so my question is, if there is a way to make it work with little changes. The solutions i found so far are not really satisfying. One solution would be to just clone children at the beginning of Parent::calculate distances and use that inside the closure(which adds an unnecessary clone). The other solution would be to extract the value field of Child like that:

use rayon::prelude::*;
struct Parent {
    children: Vec<Child>,
    values: Vec<f64>
}
struct Child {
    index: usize,
    //will keep the distances to other children in the children vactor of parent
    distances: Vec<f64>,
}

impl Parent {
    fn calculate_distances(&mut self) {
      let values = &self.values;
        self.children
            .par_iter_mut()
            .for_each(|x| x.calculate_distances(values));
    }
}

impl Child {
    fn calculate_distances(&mut self, values: &[f64]) {
       for i in 0..values.len(){
           self.distances[i]= (values[self.index]-values[i]).abs();
       }
    }
}

while this would be efficient it totally messes up my real code, and value conceptually just really belongs to Child. I am relatively new to rust, and just asked myself if there is any nice way of doing this. As far as i understand there would need to be a way to tell the compiler, that i only change the distances field in the parallel iterator, while the value field stays constant. Maybe this is a place to use unsafe? Anyways, i would really appreciate if you could hint me in the right direction, or at least confirm that my code really has to become so much messier to make it work:)

Foveng
  • 47
  • 5
  • 1
    Sounds like the typical use case for [`Cell`](https://doc.rust-lang.org/std/cell/struct.Cell.html) – Jmb Mar 16 '21 at 10:48

1 Answers1

3

Rust tries hard to prevent you from doing precisely what you want to do: retain access to the whole collection while modifying it. If you're unwilling to adjust the layout of your data to accommodate the borrow checker, you can use interior mutability to make Child::calculate_distances take &self rather than &mut self. Then your problem goes away because it's perfectly fine to hand out multiple shared references to self.children.

Ideally you'd use a RefCell because you don't access the same Child from multiple threads. But Rust doesn't allow that because, based on the signatures of the functions involved, you could do so, which would be a data race. Declaring distances: RefCell<Vec<f64>> makes Child no longer Sync, removing access to Vec<Child>::par_iter().

What you can do is use a Mutex. Although it feels initially wasteful, have in mind that each call to Child::calculate_distances() receives a different Child, so the mutex will always be uncontended and therefore cheap to lock (not involve a system call). And you'd lock it only once per Child::calculate_distances(), not on every access to the array. The code would look like this (playground):

use rayon::prelude::*;
use std::sync::Mutex;

struct Parent {
    children: Vec<Child>,
}
struct Child {
    value: f64,
    index: usize,
    //will keep the distances to other children in the children vactor of parent
    distances: Mutex<Vec<f64>>,
}

impl Parent {
    fn calculate_distances(&mut self) {
        self.children
            .par_iter()
            .for_each(|x| x.calculate_distances(&self.children));
    }
}

impl Child {
    fn calculate_distances(&self, children: &[Child]) {
        let mut distances = self.distances.lock().unwrap();
        children
            .iter()
            .enumerate()
            .for_each(|(i, x)| distances[i] = (self.value - x.value).abs());
    }
}

You can also try replacing std::sync::Mutex with parking_lot::Mutex which is smaller (only one byte of overhead, no allocation), faster, and doesn't require the unwrap() because it doesn't do lock poisoning.

user4815162342
  • 141,790
  • 18
  • 296
  • 355
  • Apparently using mutex inside par_iter can lead to deadlocks. The issue is discussed here: https://github.com/rayon-rs/rayon/issues/592 My final solution was to extract the values into a Buffer, so i don't have to copy the entire Children, which reduces the size of the unnecessary copy in my real code a lot. – Foveng Mar 19 '21 at 10:28
  • 1
    @Foveng Have you actually experienced a deadlock in that code, or are you just being careful? Locking this mutex should never deadlock because it is completely uncontended and only serves to prove to the borrow checker that accessing `distances` is safe. The example in the issue relates to the more typical use of mutex where the threads access a _shared_ resource protected by the mutex. – user4815162342 Mar 19 '21 at 10:45
  • 1
    @Foveng Also, in the linked issue, there is a `par_iter()` inside `par_iter()`, where the inner `par_iter()` executes with the mutex locked. While I understand that this kind of situation can arise and should be warned against, it's not the only possible or even a typical use of a mutex. – user4815162342 Mar 19 '21 at 10:57
  • In my real code i experienced a deadlock, i didn't try it with the example code, though. I can't rule out if i overlooked some relevant difference between my real and the sample code. Maybe i'll try if the sample code has issues later. – Foveng Mar 19 '21 at 11:07
  • 1
    I tried the sample code and did not experience a deadlock. Here is the code i use: ```fn main() { let mut parent = Parent{ children: Vec::new() }; for i in 0..10000{ parent.children.push(Child{ value: i as f64/2.0, index: i, distances: Mutex::new(vec![0.0;10000]), }) } parent.calculate_distances(); println!("finished"); }``` – Foveng Mar 19 '21 at 11:28