On page 465 of Programming Rust you can find the following code and explanation (emphasis added by me):

    use std::sync::Arc;

    fn process_files_in_parallel(filenames: Vec<String>,
                                 glossary: Arc<GigabyteMap>)
        -> io::Result<()>
    {
        ...
        for worklist in worklists {
            // This call to .clone() only clones the Arc and bumps the
            // reference count. It does not clone the GigabyteMap.
            let glossary_for_child = glossary.clone();
            thread_handles.push(
                spawn(move || process_files(worklist, &glossary_for_child))
            );
        }
        ...
    }

We have changed the type of glossary: to run the analysis in parallel, the caller must pass in an Arc<GigabyteMap>, a smart pointer to a GigabyteMap that’s been moved into the heap, by doing Arc::new(giga_map). When we call glossary.clone(), we are making a copy of the Arc smart pointer, not the whole GigabyteMap. This amounts to incrementing a reference count. With this change, the program compiles and runs, because it no longer depends on reference lifetimes. As long as any thread owns an Arc<GigabyteMap>, it will keep the map alive, even if the parent thread bails out early. There won’t be any data races, because data in an Arc is immutable.
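To check my understanding of the quoted passage, here is a minimal sketch of the same pattern. It is not the book's code: `GigabyteMap` is a hypothetical stand-in (a plain `HashMap`), and `count_known_words` is a name I made up for illustration. Each spawned thread gets its own clone of the `Arc`, which copies the pointer and increments the reference count without copying the map.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the book's GigabyteMap (assumption, not the real type).
type GigabyteMap = HashMap<String, String>;

/// Spawns one thread per worklist and returns how many words, across all
/// worklists, appear as keys in the shared glossary.
fn count_known_words(worklists: Vec<Vec<String>>, glossary: Arc<GigabyteMap>) -> usize {
    let mut thread_handles = Vec::new();

    for worklist in worklists {
        // Clones only the Arc pointer and increments the reference count;
        // the underlying map is not copied.
        let glossary_for_child = Arc::clone(&glossary);

        // `move` transfers ownership of the worklist and the cloned Arc into
        // the thread, so the closure borrows nothing from this function and
        // satisfies the 'static bound on thread::spawn.
        thread_handles.push(thread::spawn(move || {
            worklist
                .iter()
                .filter(|word| glossary_for_child.contains_key(*word))
                .count()
        }));
    }

    // Wait for every thread and add up the per-thread counts.
    thread_handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .sum()
}
```

Calling `Arc::strong_count` on the original pointer right after a clone should report one more owner than before, which is the "bumps the reference count" behaviour the comment in the book's code describes.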
In the next section they show this rewritten with Rayon:

    extern crate rayon;

    use rayon::prelude::*;

    fn process_files_in_parallel(filenames: Vec<String>, glossary: &GigabyteMap)
        -> io::Result<()>
    {
        filenames.par_iter()
            .map(|filename| process_file(filename, glossary))
            .reduce_with(|r1, r2| {
                if r1.is_err() { r1 } else { r2 }
            })
            .unwrap_or(Ok(()))
    }

You can see that the version rewritten to use Rayon accepts &GigabyteMap rather than Arc<GigabyteMap>. They don't explain how this works, though. Why doesn't Rayon require Arc<GigabyteMap>? How does Rayon get away with accepting a direct reference?