As you might already know, the Stream API uses a Spliterator and the ForkJoinPool framework to perform parallel computations. A Spliterator is used for traversing and partitioning a sequence of elements, while ForkJoinPool recursively breaks a task into smaller independent sub-tasks until they are simple enough to be executed asynchronously.
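To see those two building blocks in isolation, here is a minimal sketch (not part of the original answer; the class name SplitDemo and the single half-and-half split are just for illustration). It partitions a list once with Spliterator.trySplit() and runs each half as a task on the common ForkJoinPool:

import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class SplitDemo {
    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8);
        Spliterator<Integer> second = data.spliterator();
        Spliterator<Integer> first = second.trySplit();   // splits roughly the first half off
        ForkJoinPool pool = ForkJoinPool.commonPool();
        if (first != null) {
            pool.execute(() -> first.forEachRemaining(i -> System.out.println("first half:  " + i)));
        }
        pool.execute(() -> second.forEachRemaining(i -> System.out.println("second half: " + i)));
        pool.awaitQuiescence(5, TimeUnit.SECONDS);        // wait until both sub-tasks are done
    }
}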
As an example of how a parallel computation framework, such as the java.util.stream package, would use Spliterator and ForkJoinPool, here is one way to implement an associated parallel forEach that illustrates the primary idiom:
import java.util.Collection;
import java.util.List;
import java.util.Spliterator;
import java.util.SplittableRandom;
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Consumer;
import java.util.stream.Collectors;

public static void main(String[] args) {
    List<Integer> list = new SplittableRandom()
            .ints(24, 0, 100)
            .boxed().collect(Collectors.toList());
    parallelEach(list, System.out::println);
}

static <T> void parallelEach(Collection<T> c, Consumer<T> action) {
    Spliterator<T> s = c.spliterator();
    long batchSize = s.estimateSize() / (ForkJoinPool.getCommonPoolParallelism() * 8);
    new ParallelEach<>(null, s, action, batchSize).invoke(); // invoke the root task and wait for it to complete
}
The Fork Join Task:
static class ParallelEach<T> extends CountedCompleter<Void> {

    final Spliterator<T> spliterator;
    final Consumer<T> action;
    final long batchSize;

    ParallelEach(ParallelEach<T> parent, Spliterator<T> spliterator,
                 Consumer<T> action, long batchSize) {
        super(parent);
        this.spliterator = spliterator;
        this.action = action;
        this.batchSize = batchSize;
    }

    // The main computation performed by this task
    @Override
    public void compute() {
        Spliterator<T> sub;
        // Split off sub-spliterators and fork child tasks until the
        // remaining portion is no larger than one batch
        while (spliterator.estimateSize() > batchSize &&
               (sub = spliterator.trySplit()) != null) {
            addToPendingCount(1); // register the child before forking it
            new ParallelEach<>(this, sub, action, batchSize).fork();
        }
        // Process the elements left in this task's own portion
        spliterator.forEachRemaining(action);
        propagateCompletion(); // signal completion up the task tree
    }
}
Original source.
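The task extends CountedCompleter rather than a plain RecursiveAction so that the forked children never have to be joined explicitly: each addToPendingCount(1) registers one child before it is forked, and propagateCompletion() decrements the pending count up the parent chain once a task has processed its own batch, so the invoke() call in parallelEach returns only after every sub-task has finished.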
Also, keep in mind that a parallel computation is not always faster than a sequential one, and you always have a choice - When to use parallel stream.
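To make that choice concrete with the sample list from above, the built-in Stream API offers both forms directly, and its parallel variant is backed by the same Spliterator and common ForkJoinPool machinery sketched in this answer:

list.stream().forEach(System.out::println);         // sequential traversal
list.parallelStream().forEach(System.out::println); // parallel traversal on the common pool; output order is not guaranteed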