Reading a vector from multiple threads

Question

I have a function that returns a vector of strings, which is later read by multiple threads. How can I do this in Rust?

  fn get_list() -> Vec<String> { ... }
  fn read_vec() {
      let v = get_list();
      for i in 1..10 {
          handles.push(thread::spawn(|| { do_work(&v); }));
      }
      handles.join();
  }

I think I need to extend the lifetime of v to 'static and pass it to the threads as an immutable reference, but I am not sure how.

The answer is "use an Arc". It will guarantee that a) the vec will be kept alive until all threads finish b) that it will be cleaned up afterwards — Ivan C, Jan 04 '23 at 01:08

score 1 · Accepted Answer · answered Jan 04 '23 at 01:25

The problem you are facing is that the threads spawned by thread::spawn run for an unknown amount of time. You'll need to make sure that your Vec<String> outlives these threads.

There are a few ways to do that:

- You can use atomic reference counting: wrap the vector in an Arc<Vec<String>> and clone the Arc for each thread. The Vec<String> is deallocated only once every Arc has been dropped.
- You can leak the Vec<String>. I personally like this approach, but only if you need the Vec<String> for the entire runtime of your program. To achieve this, turn your Vec<String> into a &'static [String] with Vec::leak.
- You can ensure that your threads do not keep running after the read_vec function returns. This is essentially what you are doing by calling handles.join(). However, the compiler doesn't see that these threads are joined later, and there might be edge cases where they are not joined (what happens when the 2nd thread::spawn panics?). To make this explicit, use the scope function in std::thread.
- Of course, you can also just clone the Vec<String> and give each thread its own copy.
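A minimal sketch of the first two options (Arc and Vec::leak) might look like the following. The bodies of get_list and do_work are hypothetical stand-ins, since the question does not show them, and the thread count of 4 is arbitrary.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the question's `get_list`.
fn get_list() -> Vec<String> {
    vec!["alpha".to_string(), "beta".to_string(), "gamma".to_string()]
}

// Hypothetical stand-in for the question's `do_work`:
// returns the total length of all strings.
fn do_work(v: &[String]) -> usize {
    v.iter().map(|s| s.len()).sum()
}

// Option 1: share the Vec through an Arc; each thread owns one Arc clone.
fn read_vec_arc() -> Vec<usize> {
    let v = Arc::new(get_list());
    let mut handles = Vec::new();
    for _ in 0..4 {
        // Increments the reference count; the strings are not copied.
        let v = Arc::clone(&v);
        handles.push(thread::spawn(move || do_work(&v)));
    }
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

// Option 2: leak the Vec to obtain a `&'static [String]`.
// The memory is never freed, so this only fits data that lives
// for the whole program anyway.
fn read_vec_leak() -> Vec<usize> {
    let v: &'static [String] = Vec::leak(get_list());
    let mut handles = Vec::new();
    for _ in 0..4 {
        // A shared reference is `Copy`, so every closure gets its own copy.
        handles.push(thread::spawn(move || do_work(v)));
    }
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Both functions can be called from main. In the Arc version the vector is freed once the last clone is dropped, while the leaked version keeps it allocated until the process exits.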

TL;DR:
For this particular use-case, I'd recommend std::thread::scope. If the Vec<String> lives for the entire duration of your program, leaking it using Vec::leak is a great and often under-used solution. For more complex scenarios, wrapping the Vec<String> in an Arc is probably the right way to go.
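As a rough sketch of the std::thread::scope recommendation (available since Rust 1.63): the bodies of get_list and do_work below are hypothetical stand-ins, because the question elides them, and the thread count is arbitrary.

```rust
use std::thread;

// Hypothetical stand-in for the question's `get_list`.
fn get_list() -> Vec<String> {
    vec!["alpha".to_string(), "beta".to_string(), "gamma".to_string()]
}

// Hypothetical stand-in for the question's `do_work`:
// returns the total length of all strings.
fn do_work(v: &[String]) -> usize {
    v.iter().map(|s| s.len()).sum()
}

fn read_vec_scoped() -> Vec<usize> {
    let v = get_list();
    // Every thread spawned on `s` is joined before `scope` returns,
    // so the closures may borrow `v` without `'static` or `Arc`.
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..4 {
            handles.push(s.spawn(|| do_work(&v)));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because the scope guarantees the threads are finished when it returns, v is dropped normally at the end of read_vec_scoped.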

answered Jan 04 '23 at 01:25

NyxCode

Thank you, well explained. Why do I have to clone an Arc? Shouldn't a copy of an Arc increase the refcount automatically? – sanjivgupta Jan 04 '23 at 02:36
@sanjivgupta I used `clone` as in [`std::clone::Clone`](https://doc.rust-lang.org/std/clone/trait.Clone.html). In Rust, `std::marker::Copy` marks types whose values are copied bit by bit, while `std::clone::Clone::clone` may do more than a bitwise copy, for example incrementing the refcount, as in `Arc`. – NyxCode Jan 04 '23 at 02:41
As you can see in the [docs](https://doc.rust-lang.org/std/sync/struct.Arc.html), `Arc` implements `Clone`, but it's not `Copy`. – NyxCode Jan 04 '23 at 02:42
Thank you again. The thing I needed to know is "A copy in Rust is a bitwise copy, while clone can do the right thing". I come from a C++ background, where copying a shared_ptr increments the refcount. – sanjivgupta Jan 04 '23 at 02:44
@sanjivgupta Ah, I see. Yeah, the memory semantics are different from C++. Assigning a value (`let x = y`) or passing it by value is a move, like `std::move` in C++. `Copy` in Rust changes that, and the old value stays valid. `Clone`, on the other hand, is just a normal method, used in Rust where you'd use a copy constructor in C++. – NyxCode Jan 04 '23 at 02:47
This great talk goes into this difference, if you want to understand this more deeply: https://www.youtube.com/watch?v=IPmRDS0OSxM – NyxCode Jan 04 '23 at 02:48
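To illustrate the move/clone distinction from the comments above, here is a small sketch that observes the reference count of an Arc with Arc::strong_count. The function name arc_counts is made up for this example.

```rust
use std::sync::Arc;

// Returns the strong count before cloning, after cloning, and after a move.
fn arc_counts() -> (usize, usize, usize) {
    let a = Arc::new(vec![String::from("x")]);
    let before = Arc::strong_count(&a);

    // Explicit clone: a second handle to the same Vec, refcount goes up.
    let _b = Arc::clone(&a);
    let after_clone = Arc::strong_count(&a);

    // Plain assignment is a move: `a` can no longer be used afterwards,
    // and the refcount is unchanged because no new handle was created.
    let c = a;
    let after_move = Arc::strong_count(&c);

    (before, after_clone, after_move)
}
```

Using `a` after the `let c = a;` line would be a compile error, which is how a move differs from copying a shared_ptr in C++.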