There are actually two factors that determine whether reduce produces predictable results with parallel streams:
Associativity of the accumulator
As others have mentioned, the binary operator provided to accumulate the stream must be associative to produce predictable results. Associativity means that how the elements are grouped while performing the operation doesn't matter:
(((a op b) op c) op d) = ((a op b) op (c op d))
e.g. A binary operation that is not associative produces a different result when the reduction runs in parallel:
List<String> strings = List.of("An", "example", "of", "a", "binary", "operator");
// (s, str) -> String.valueOf(s.equals(str)) is not associative: the outcome depends on grouping
System.out.println(strings.stream().reduce("", (s, str) -> String.valueOf(s.equals(str))));            // sequential: always prints "false"
System.out.println(strings.stream().parallel().reduce("", (s, str) -> String.valueOf(s.equals(str)))); // parallel: grouping differs, so the result is unpredictable
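By contrast, an associative accumulator such as Integer::sum gives the same result whether or not the stream is parallel. A minimal sketch for comparison (the numbers list is just an illustrative data set, not part of the question):
List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
// Integer::sum is associative, so the grouping of partial results doesn't matter
System.out.println(numbers.stream().reduce(0, Integer::sum));            // 21
System.out.println(numbers.stream().parallel().reduce(0, Integer::sum)); // 21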
The identity value
If the value provided as the identity element is not actually an identity for the accumulator, the results can still be unpredictable. Hence it must hold that:
identity op x = x
x op identity = x
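These laws can be checked directly for String::concat; the snippet below is just an illustrative check, and the variable x is an arbitrary sample value:
String x = "example";
System.out.println("".concat(x).equals(x));   // true:  "" satisfies identity op x = x
System.out.println(" ".concat(x).equals(x));  // false: " " does not, so it is not an identity for concat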
e.g. Using " " (space), which is not an identity element for string concatenation, produces a different result when the stream is run in parallel:
List<String> identityCheck = List.of("An", "example", "of", "a", "identity", "element");
System.out.println(identityCheck.stream().reduce(" ", String::concat));            // sequential: a single leading space from the identity
System.out.println(identityCheck.stream().parallel().reduce(" ", String::concat)); // parallel: extra spaces appear wherever partial results are combined
Not only does the parallel execution produce unexpected output here, but different runs on the same data set may also produce different results.
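Using the true identity for concatenation, the empty string "", gives the same result regardless of whether the stream is parallel. A minimal sketch reusing the same list:
System.out.println(identityCheck.stream().reduce("", String::concat));            // "Anexampleofaidentityelement"
System.out.println(identityCheck.stream().parallel().reduce("", String::concat)); // same output, sequential or parallel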