This is indeed not obvious because the actual reason is rather historical. The first release version of Java 8 had a Stream implementation which did consider the unordered nature of findAny
, at least in some cases.
Not in the case of a skippable sorted
but in more cases than correct. This has been discussed in
In the latter Q&A we find a comment from Brian Goetz, one of the architects:
After some analysis we chose to back out back-propagation entirely. The only place where this pays off is optimizing away a sort; if you have a pipeline that sorts into an unordered terminal op, that's probably a user error anyway.
So the current state of the reference implementation is that the unordered nature of the terminal operation is not used for optimizing preceding intermediate operations.
This doesn’t imply that such optimization wouldn’t be valid; the example of skipping a sort
when the subsequent operations do not care about the order is explicitly mentioned as valid, though Brian Goetz suggests that such a scenario might be a mistake on the developer’s side anyway.
I have a slightly different point of view, as I think it would be a useful scenario if you have Stream returning method, which may return a sorted Stream, while operations chained by the caller determine whether this sorting is relevant for the final result or not.
But whether such an optimization happens or not does not affect the correctness of the result. Note that this does not only apply to unordered operations.
When you have
Optional<Integer> res = IntStream.rangeClosed(1, 10)
.mapToObj(x -> {
System.out.println("first map: " + x);
return 2*x;
})
.sorted((a,b)-> {
System.out.println("sorting " + a + " : " + b);
return Integer.compare(a, b);
})
.map(x -> {
System.out.println("second map: " + x);
return 2*x;
})
.findFirst();
it would be perfectly valid to transform the operation to (the equivalent of)
Optional<Integer> res = IntStream.rangeClosed(1, 10)
.mapToObj(x -> {
System.out.println("first map: " + x);
return 2*x;
})
.min((a,b)-> {
System.out.println("sorting " + a + " : " + b);
return Integer.compare(a, b);
})
.map(x -> {
System.out.println("second map: " + x);
return 2*x;
});
which would be more efficient and still return the correct result for the terminal operation while the evaluation order of the first mapping function and the comparator is different.
But the correct result is all that matters. You should never assume a specific evaluation order at all. You don’t find a specification of the evaluation order in the manual, because it has been left out intentionally. The presence or absence of optimizations can change at any time. If you want a dramatic example, replace your findAny()
with count()
and compare your example’s behavior in Java 8 and Java 9 (or newer).
// run with Java 8 and Java 9+ and compare...
long count = IntStream.rangeClosed(1, 10)
.mapToObj(x -> {
System.out.println("first map: " + x);
return 2*x;
})
.sorted((a,b)-> {
System.out.println("sorting " + a + " : " + b);
return Integer.compare(a, b);
})
.map(x -> {
System.out.println("second map: " + x);
return 2*x;
})
.count();
System.out.println("found " + count + " elements");