It’s a fundamental principle of parallel streams that the encounter order doesn’t have to match the processing order. This enables concurrent processing of the items of sublists or subtrees while still assembling a correctly ordered result, if necessary. Such bulk processing of chunks is explicitly allowed, and for the parallel processing of ordered streams it is even essential.
This behavior is determined by the particular Spliterator’s trySplit implementation. The specification says:
“If this Spliterator is ORDERED, the returned Spliterator must cover a strict prefix of the elements”
…
“API Note: An ideal trySplit method efficiently (without traversal) divides its elements exactly in half, allowing balanced parallel computation.”
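To see what the prefix rule means in practice, here is a minimal check (my own illustration, nothing from the specification itself) using the array-backed list returned by Arrays.asList, whose spliterator reports ORDERED and splits in half:

    import java.util.Arrays;
    import java.util.Spliterator;

    public class PrefixSplitDemo {
        public static void main(String[] args) {
            Spliterator<Integer> suffix = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8).spliterator();
            Spliterator<Integer> prefix = suffix.trySplit();         // returned part covers a strict prefix
            prefix.forEachRemaining(i -> System.out.print(i + " ")); // prints 1 2 3 4
            System.out.println();
            suffix.forEachRemaining(i -> System.out.print(i + " ")); // prints 5 6 7 8
        }
    }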
Why was this strategy fixed in the specification and not, e.g., an even/odd split?
Well, consider a simple use case: a list is filtered and collected into a new list, so the encounter order must be retained. With the prefix rule, that’s rather easy to implement: split off a prefix, filter both chunks concurrently, then add the result of the prefix filtering to the new list, followed by the filtered suffix.
With an even/odd strategy, that’s impossible. You may filter both parts concurrently, but afterwards you don’t know how to join the results correctly unless you track each item’s position throughout the entire operation. Even then, joining these interleaved items would be much more complicated than performing an addAll per chunk.
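To make that concrete, here is a hand-rolled sketch of the prefix approach (a simplified illustration with a hypothetical helper method, not how the actual stream implementation divides its work): split once, filter both chunks concurrently, then concatenate.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Spliterator;
    import java.util.concurrent.CompletableFuture;
    import java.util.function.Predicate;
    import java.util.stream.Collectors;
    import java.util.stream.StreamSupport;

    public class PrefixFilterSketch {
        // filter a list in two concurrent chunks, then reassemble in encounter order
        static <T> List<T> filterInTwoChunks(List<T> list, Predicate<T> p) {
            Spliterator<T> suffix = list.spliterator();
            Spliterator<T> prefix = suffix.trySplit();   // covers a strict prefix of the elements
            if (prefix == null) {                        // too small to split, filter sequentially
                return list.stream().filter(p).collect(Collectors.toList());
            }
            // filter the prefix on another thread while this thread filters the suffix
            CompletableFuture<List<T>> prefixPart = CompletableFuture.supplyAsync(() ->
                StreamSupport.stream(prefix, false).filter(p).collect(Collectors.toList()));
            List<T> suffixPart =
                StreamSupport.stream(suffix, false).filter(p).collect(Collectors.toList());
            List<T> result = new ArrayList<>(prefixPart.join()); // prefix results first
            result.addAll(suffixPart);                           // then the suffix results
            return result;
        }

        public static void main(String[] args) {
            // prints [2, 4, 6, 8]: encounter order preserved without tracking individual positions
            System.out.println(filterInTwoChunks(List.of(1, 2, 3, 4, 5, 6, 7, 8), i -> i % 2 == 0));
        }
    }

The real stream pipeline splits recursively and runs on the fork/join pool, but the joining principle is the same: concatenate the sub-results in split order.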
You might have noticed that all of this applies only if there is an encounter order that might have to be retained. If your spliterator doesn’t report the ORDERED characteristic, it is not required to return a prefix. Nevertheless, the default implementation you may have inherited from AbstractSpliterator is designed to be compatible with ordered spliterators, so if you want a different strategy, you have to implement the split operation yourself.
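For example, here is a sketch of a spliterator (purely illustrative, no such class exists in the JDK) that deliberately does not report ORDERED and is therefore free to split its remaining elements into interleaved strides instead of a prefix:

    import java.util.Spliterator;
    import java.util.function.Consumer;
    import java.util.stream.StreamSupport;

    public class StridedSpliterator implements Spliterator<Integer> {
        private int next;        // next value to deliver
        private int stride;      // distance between delivered values
        private final int end;   // exclusive upper bound

        StridedSpliterator(int start, int stride, int end) {
            this.next = start; this.stride = stride; this.end = end;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Integer> action) {
            if (next >= end) return false;
            action.accept(next);
            next += stride;
            return true;
        }

        @Override
        public Spliterator<Integer> trySplit() {
            if (end - next <= stride) return null;   // at most one element left, don't split
            // hand out every other remaining value; keep the rest for this spliterator
            Spliterator<Integer> other = new StridedSpliterator(next + stride, stride * 2, end);
            stride *= 2;
            return other;
        }

        @Override
        public long estimateSize() {
            return next >= end ? 0 : (end - next + stride - 1) / stride;
        }

        @Override
        public int characteristics() {
            return SIZED | SUBSIZED | NONNULL | IMMUTABLE; // deliberately not ORDERED
        }

        public static void main(String[] args) {
            StreamSupport.stream(new StridedSpliterator(0, 1, 16), true)
                         .forEach(i -> System.out.println(Thread.currentThread().getName() + ": " + i));
        }
    }

Since the resulting stream is unordered, forEach is free to print the values in whatever order the worker threads happen to finish.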
Or you can use a different way of implementing an unordered stream, e.g.

    Stream.generate(() -> {
        // every generated element waits one second, then reports the name of the thread producing it
        LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
        return Thread.currentThread().getName();
    }).parallel().forEach(System.out::println);

might be closer to what you expected.