I'm trying to parallelize an algorithm I have. This is a sketch of how I would write it in C++:

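#include <functional>
#include <thread>
#include <vector>

void thread_func(std::vector<int>& results, int threadid) {
    results[threadid] = threadid;
}

std::vector<int> foo() {
    std::vector<int> results(4);
    std::vector<std::thread> threads;

    for (int i = 0; i < 4; i++) {
        // std::ref passes the vector by reference instead of copying it
        threads.emplace_back(thread_func, std::ref(results), i);
    }

    for (auto& t : threads) {
        t.join();
    }

    return results;
}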

The point here is that each thread has a reference to a shared, mutable object that it does not own. It seems like this is difficult to do in Rust. Should I try to cobble it together in terms of (and I'm guessing here) Mutex, Cell and &mut, or is there a better pattern I should follow?

anjruu
  • Duplicate of http://stackoverflow.com/questions/31644152/processing-vec-in-parallel-how-to-do-safely-or-without-using-unstable-features ? – Shepmaster Aug 27 '15 at 12:51
  • See also: https://crates.io/crates/scoped_threadpool –  Aug 27 '15 at 12:53
  • I think the Rust book covers exactly that case here: https://doc.rust-lang.org/book/concurrency.html#safe-shared-mutable-state – val Aug 27 '15 at 13:00
  • See also my response to a similar question, which uses pointers: http://stackoverflow.com/questions/31608015/parallel-computing-of-array-elements-in-rust/31609380#31609380 – eulerdisk Aug 27 '15 at 13:03

1 Answer

The proper way is to use Arc<Mutex<...>> or, for example, Arc<RwLock<...>>. Arc is a thread-safe, reference-counted pointer that gives shared ownership of immutable data, and Mutex/RwLock add synchronized interior mutability on top of it. Your code would then look like this:

use std::sync::{Arc, Mutex};
use std::thread;

fn thread_func(results: Arc<Mutex<Vec<i32>>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Arc<Mutex<Vec<i32>>> {
    let results = Arc::new(Mutex::new(vec![0; 4]));

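    // Each thread gets its own clone of the Arc (a cheap reference-count bump).
    let handles: Vec<_> = (0..4).map(|i| {
        let results = results.clone();
        thread::spawn(move || thread_func(results, i))
    }).collect();

    // join() returns a Result; unwrap it so a panic in a worker is not silently ignored.
    for handle in handles {
        handle.join().unwrap();
    }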

    results
}

This unfortunately requires you to return Arc<Mutex<Vec<i32>>> from the function, because at the time of writing there is no stable way to "unwrap" the value. An alternative is to clone the vector before returning.
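
For readers on a newer toolchain than this answer targets: the standard library now provides Arc::try_unwrap and Mutex::into_inner, which together allow the vector to be moved back out once every worker has been joined. The following is only a sketch under that assumption; the function name foo_unwrapped is made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// `foo_unwrapped` is a hypothetical name used only for this sketch.
fn foo_unwrapped() -> Vec<i32> {
    let results = Arc::new(Mutex::new(vec![0; 4]));

    let handles: Vec<_> = (0..4)
        .map(|i: i32| {
            let results = Arc::clone(&results);
            thread::spawn(move || {
                let mut guard = results.lock().unwrap();
                guard[i as usize] = i;
            })
        })
        .collect();

    // Each worker drops its clone of the Arc when its closure finishes,
    // so after joining all of them only the original Arc remains.
    for handle in handles {
        handle.join().unwrap();
    }

    // try_unwrap succeeds only when this is the last owner; into_inner
    // then consumes the Mutex and hands back the Vec without copying it.
    Arc::try_unwrap(results)
        .expect("all worker threads have been joined")
        .into_inner()
        .unwrap()
}
```

Calling foo_unwrapped() returns a plain Vec<i32>, so the caller never sees the Arc or the Mutex.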

However, with a crate like scoped_threadpool it can be done in a much nicer way. (Its approach could only recently be made sound; something like it will probably make it into the standard library in place of the now-deprecated thread::scoped() function, which is unsafe.)

extern crate scoped_threadpool;

use scoped_threadpool::Pool;

fn thread_func(result: &mut i32, thread_id: i32) {
    *result = thread_id;
}

fn foo() -> Vec<i32> {
    // Must be `mut` because iter_mut() below borrows the vector mutably.
    let mut results = vec![0; 4];
    let mut pool = Pool::new(4);

    pool.scoped(|scope| {
        for (i, e) in results.iter_mut().enumerate() {
            scope.execute(move || thread_func(e, i as i32));
        }
    });

    results
}
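
The same per-element pattern can also be written without an external crate on newer toolchains, since std::thread::scope has been stable since Rust 1.63. This is a minimal sketch assuming such a toolchain; foo_scoped is a hypothetical name, and unlike the pool above it spawns one OS thread per element.

```rust
use std::thread;

// `foo_scoped` is a hypothetical name used only for this sketch.
fn foo_scoped() -> Vec<i32> {
    let mut results = vec![0; 4];

    thread::scope(|scope| {
        for (i, slot) in results.iter_mut().enumerate() {
            // Each thread receives an exclusive borrow of exactly one
            // element, so no lock is needed.
            scope.spawn(move || {
                *slot = i as i32;
            });
        }
    }); // every thread spawned in the scope is joined before this returns

    results
}
```

Because the borrows handed to the threads end when the scope ends, results can be returned directly as a Vec<i32>.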

If your thread_func needs to access the whole vector, however, you can't get away without synchronization, so you would need a Mutex, and you would still get the unwrapping problem:

extern crate scoped_threadpool;

use std::sync::Mutex;

use scoped_threadpool::Pool;

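fn thread_func(results: &Mutex<Vec<i32>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Vec<i32> {
    let results = Mutex::new(vec![0; 4]);
    let mut pool = Pool::new(4);

    // Share the Mutex by reference; moving the Mutex itself into each
    // closure would not compile.
    let shared = &results;
    pool.scoped(|scope| {
        for i in 0..4 {
            scope.execute(move || thread_func(shared, i));
        }
    });

    // Bind the copy to a local so the lock guard is dropped before `results`.
    let copy = results.lock().unwrap().clone();
    copy
}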

But at least you don't need any Arcs here. Also note that the execute() method is unsafe if you use the stable compiler, because stable does not yet have the corresponding fix that makes it safe. According to the crate's build script, it is safe on all compiler versions greater than 1.4.0.
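
As a hedged illustration of the "no Arcs" point on a newer toolchain, the whole-vector case can be sketched with std::thread::scope (stable since Rust 1.63) and Mutex::into_inner, which also avoids the final clone. The name foo_scoped_mutex is hypothetical, and this is not the scoped_threadpool API.

```rust
use std::sync::Mutex;
use std::thread;

// `foo_scoped_mutex` is a hypothetical name used only for this sketch.
fn foo_scoped_mutex() -> Vec<i32> {
    let results = Mutex::new(vec![0; 4]);

    thread::scope(|scope| {
        for i in 0..4 {
            // A plain shared reference is enough: the scope guarantees
            // the threads finish before `results` is dropped.
            let results = &results;
            scope.spawn(move || {
                let mut guard = results.lock().unwrap();
                guard[i as usize] = i;
            });
        }
    });

    // No thread holds a reference any more, so the Mutex can be consumed.
    results.into_inner().unwrap()
}
```

Every thread can read and write the whole vector through the lock, and the function still returns a bare Vec<i32>.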

Vladimir Matveev
  • It'd be nice if someone added a link to the pull request in the Rust compiler that fixes the bug preventing `execute()` from being made safe on the stable channel (see the last paragraph). – Vladimir Matveev Aug 27 '15 at 13:24
  • Could I do something like `Arc>>`, so that I didn't have to return an `Arc>`? – anjruu Aug 27 '15 at 15:50
  • No, you can't. That's the whole point of a scope-based API, actually. You can't move data containing borrowed references into `thread::spawn()` because it may be unsafe: you could forget to join the spawned thread, and the references could become invalid. With a scoped API the join is enforced statically, so you can share a reference and do not need `Arc`s. – Vladimir Matveev Aug 27 '15 at 16:20