I'm trying to get started with Rust threads. In my example (contrived but based on a real problem), I want to accept a read-only HashMap as an argument to a function and then supply it to a number of threads, each of which reads from its own partition of it.

use std::{
    collections::HashMap,
    sync::{mpsc::channel, Arc},
    thread,
};

const THREADS: u32 = 10;

// Concurrently add the lengths of values.
pub fn concurrent_lens(inputs: &HashMap<u32, String>) -> usize {
    let inputs_arc = Arc::new(inputs);

    let (tx, rx) = channel();

    // Count length of all strings in parallel.
    // Each thread takes a partition of the data.
    for thread_i in 0..THREADS {
        let tx = tx.clone();
        let inputs_clone = inputs_arc.clone();

        thread::spawn(move || {
            for (i, content) in inputs_clone.iter() {
                // Only look at my partition's keys.
                if (i % THREADS) == thread_i {
                    // Something expensive with the string.
                    let expensive_operation_result = content.len();

                    tx.send(expensive_operation_result).unwrap();
                }
            }
        });
    }

    // Drop the original sender so that `rx.iter()` ends once every worker is done.
    drop(tx);

    // Join and sum results.
    let mut result = 0;
    for len in rx.iter() {
        result += len;
    }

    result
}

However, the compiler says:

error[E0621]: explicit lifetime required in the type of `inputs`
  --> src/main.rs:21:9
   |
10 | pub fn concurrent_lens(inputs: &HashMap<u32, String>) -> usize {
   |                        ------ consider changing the type of `inputs` to `&'static std::collections::HashMap<u32, std::string::String>`
...
21 |         thread::spawn(move || {
   |         ^^^^^^^^^^^^^ lifetime `'static` required

My options are, as I understand:

  • Make `inputs` static. This isn't possible, as it's not static data.
  • Let the function take ownership of `inputs` (not take a ref). So my function would be `pub fn concurrent_lens(inputs: HashMap<u32, String>) -> usize`. This makes the compiler happy about its lifetime, but the data lives outside the function, and has a longer lifetime outside.
  • Ditto, but pass in a copy. Not ideal, it's a lot of data.
  • Let the function take an `Arc` as an argument, i.e. `pub fn concurrent_lens(inputs: Arc<HashMap<u32, String>>) -> usize`. This works fine, but seems like a really leaky abstraction, as the calling code shouldn't have to know that it's calling a function that uses concurrency.
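
To make the last option concrete, here is a minimal sketch of the `Arc`-taking variant. It keeps the same partitioning scheme; the only assumptions are that the caller builds the `Arc` itself and that the original sender is dropped before receiving, so `rx.iter()` can finish once every worker's clone is gone.

```rust
use std::{
    collections::HashMap,
    sync::{mpsc::channel, Arc},
    thread,
};

const THREADS: u32 = 10;

// Concurrently add the lengths of values; the map is shared via `Arc`.
pub fn concurrent_lens(inputs: Arc<HashMap<u32, String>>) -> usize {
    let (tx, rx) = channel();

    for thread_i in 0..THREADS {
        let tx = tx.clone();
        // Cloning the `Arc` only bumps a reference count; the map is not copied.
        let inputs = Arc::clone(&inputs);

        thread::spawn(move || {
            for (i, content) in inputs.iter() {
                // Only look at my partition's keys.
                if (i % THREADS) == thread_i {
                    tx.send(content.len()).unwrap();
                }
            }
        });
    }

    // Without this, the receiver would wait forever on the original sender.
    drop(tx);

    rx.iter().sum()
}
```

The caller would then write something like `concurrent_lens(Arc::clone(&shared))`, which is exactly where the `Arc` leaks into the calling code.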

None of these seems quite right. Am I missing something?

Shepmaster
Joe
  • I believe your question is answered by the answers of [How can I pass a reference to a stack variable to a thread?](https://stackoverflow.com/q/32750829/155423). If you disagree, please [edit] your question to explain the differences. Otherwise, we can mark this question as already answered. – Shepmaster Aug 11 '18 at 22:55
  • Thanks, I did read quite a lot of questions first. Not sure if that one was included. The principal difference I thought I had was that the argument is supplied as a reference. – Joe Aug 11 '18 at 22:57
  • I don't follow your comment. *that the argument is supplied as a reference*, but the proposed duplicate says *pass a reference [...] to a thread*. Can you further explain what difference you see? – Shepmaster Aug 11 '18 at 22:59
  • Ah, yes. I was hoping to understand how do this using the standard library, not extra crates, as this seems like the simplest possible useful thing I could use the library for. – Joe Aug 11 '18 at 23:00
  • Per question 32750829, `wrapper` is passed into `run_thread`, not passed as a reference (my second bullet point). – Joe Aug 11 '18 at 23:01
  • *how do this using the standard library, not extra crates* — your question has nothing about this restriction. If someone had spent time writing up a great answer that told you to use a crate, they'd be pretty upset if you said "oh, this doesn't answer it because of something I didn't tell you beforehand". – Shepmaster Aug 11 '18 at 23:05
  • *`wrapper` is passed into `run_thread`, not passed as a reference* — The proposed duplicate's point is that `wrapper` is `Wrapper` where *"`T` is a reference to a big object I don't want copied"*. Something being a reference versus containing a reference does not change the fundamental calculus of this problem. – Shepmaster Aug 11 '18 at 23:07
  • If we're talking about feelings, then any approach to language learning is a question of adjusting scope and error bars in comparison to other languages (if you take all assumptions off the table you have to go back to electrons). Based on past experience, my assumption was that this task seemed to be on the simple end of the spectrum wrt use of a built-in library. It seems that this isn't the case, and those crates are the way to go. I don't have a hard objection (or any objection) to using community crates, it was just a surprise. – Joe Aug 11 '18 at 23:10
  • That said 32750829 does seem to be a duplicate. Thank you for your time. – Joe Aug 11 '18 at 23:16
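
On the standard-library point raised in the comments: scoped threads have since been added to `std` as `std::thread::scope` (stable from Rust 1.63), so the borrowed signature can be kept without an extra crate. The following is only a sketch under that version assumption, reusing the question's names. Every thread spawned inside the scope is joined before `scope` returns, which is what lets the closures borrow `inputs` without a `'static` bound.

```rust
use std::{collections::HashMap, sync::mpsc::channel, thread};

const THREADS: u32 = 10;

// Concurrently add the lengths of values, borrowing the map from the caller.
pub fn concurrent_lens(inputs: &HashMap<u32, String>) -> usize {
    let (tx, rx) = channel();

    // Assumes Rust 1.63+; all scoped threads are joined when `scope` returns.
    thread::scope(|s| {
        for thread_i in 0..THREADS {
            let tx = tx.clone();
            s.spawn(move || {
                for (i, content) in inputs.iter() {
                    // Only look at my partition's keys.
                    if (i % THREADS) == thread_i {
                        tx.send(content.len()).unwrap();
                    }
                }
            });
        }
    });

    // The workers are finished; drop the last sender so the iterator ends.
    drop(tx);

    rx.iter().sum()
}
```

The calling code just passes `&map` and never sees an `Arc`, so the concurrency stays an implementation detail of the function.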
