0

I am writing a command-line application in Java 8. There's a part that involves some computation, and I believe it could benefit from running in parallel using multiple threads. However, I have not much experience in writing multi-threaded applications, so I hope you could steer me in the right direction how should I design the parallel part of my code.

For simplicity, let's pretend the method in question receives a relatively big array of longs, and it should return a Set containing only prime numbers:

public final static boolean checkIfNumberIsPrime(long number) {
  // algorithm implementation, not important here
  // ...
}

// a single-threaded version
public Set<Long> extractPrimeNumbers(long[] inputArray) {
  Set<Long> result = new HashSet<>();
  for (long number : inputArray) {
    if (checkIfNumberIsPrime(number)) {
      result.add(number);
    }
  }
  return result;
}

Now, I would like to refactor method extractPrimeNumbers() in such way that it would be executed by four threads in parallel, and when all of them are finished, return the result. Off the top of my head, I have the following questions:

  • Which approach would be more suitable for the task: ExecutorService or Fork/Join? (each element of inputArray[] is completely independent and they can be processed in any order whatsoever)
  • Assuming there are 1 million elements in inputArray[], should I "ask" thread #1 to process all indexes 0..249999, thread #2 - 250000..499999, thread #3 - 500000..749999 and thread #4 - 750000..999999? Or should I rather treat each element of inputArray[] as a separate task to be queued and then executed by an applicable worker thread?
  • If a prime number is detected, it should be added to `Set result, therefore it needs to be thread-safe (synchronized). So, perhaps it would be better if each thread maintained its own, local result-set, and only when it is finished, it would transfer its contents to the global result, in one go?
  • Is Spliterator of any use here? Should they be used to partition inputArray[] somehow?
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Leszek Pachura
  • 403
  • 4
  • 18

1 Answers1

2

Parallel stream

Use none of these. Parallel streams are going to be enough to deal with this problem much more straightforwardly than any of the alternatives you list.

return Arrays.parallelStream(inputArray)
  .filter(n -> checkIfNumberIsPrime(n))
  .boxed()
  .collect(Collectors.toSet());

For more info, see The Java™ Tutorials > Aggregate Operations > Parallelism.

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • Could the filter use a method reference? `.filter( WhateverClass :: checkIfNumberIsPrime )` – Basil Bourque Jul 20 '22 at 15:18
  • Thanks! But how would I control how many threads are used to execute **parallelStream()**? As far as I understand, they should correspond to processor cores... – Leszek Pachura Jul 20 '22 at 16:27
  • @BasilBourque: Yes, but the OP didn't give enough information to determine the name of the class. – Louis Wasserman Jul 20 '22 at 17:05
  • @LeszekPachura: You can as described [here](https://stackoverflow.com/a/22269778/869736), but you shouldn't. `parallelStream` will figure out the right number of threads for you. – Louis Wasserman Jul 20 '22 at 17:07