I think Holger's and Sotirios' answers are accurate, but inasmuch as I'm the guy who made the statement, I guess I should explain myself.
I'm mainly talking about spliterator characteristics, in particular the SIZED characteristic. This is basically "static" information about the stream stages that is known at pipeline setup time, but before the stream actually executes. Indeed, it's used for determining the execution strategy for the stream, so it has to be known before the stream executes.
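To make the "known before execution" point concrete, here is a minimal sketch that asks a source spliterator for its characteristics without traversing any elements. The class name SizedAtSetup and the helper isSized are made up for illustration; the calls themselves (spliterator(), hasCharacteristics(), getExactSizeIfKnown()) are standard java.util and java.util.stream API.

```java
import java.util.Spliterator;
import java.util.stream.IntStream;

public class SizedAtSetup {
    // Hypothetical helper: true if the stream's spliterator reports SIZED.
    // spliterator() is a terminal operation, but obtaining the spliterator
    // does not by itself traverse any elements.
    static boolean isSized(IntStream stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        Spliterator.OfInt sp = IntStream.range(0, 100).spliterator();

        // The range source knows its size up front.
        System.out.println(sp.hasCharacteristics(Spliterator.SIZED)); // true
        System.out.println(sp.getExactSizeIfKnown());                 // 100
    }
}
```

Nothing has been consumed from the range at the point where these values are printed, which is what lets the stream implementation use them to pick an execution strategy.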
The limit() operation creates a spliterator that wraps its upstream spliterator, so the limit spliterator needs to determine what characteristics to return. Even if its upstream spliterator is SIZED, it doesn't know the exact size, so it has to turn off the SIZED characteristic.
So if you, the programmer, were to write:
IntStream.range(0, 100).limit(10)
you'd say of course that stream has exactly 10 elements. (And it will.) But the resulting spliterator is still not SIZED. After all, the limit operator doesn't know the difference between the above and this:
IntStream.range(0, 1).limit(10)
at least in terms of spliterator characteristics.
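If you want to check this on your own JDK, a sketch along these lines prints what each pipeline reports next to how many elements it actually produces. The class name LimitCharacteristics and the helper reportsSized are hypothetical. I have deliberately not annotated the SIZED results for the two limit() pipelines: what they report is an implementation detail, and it may well differ between JDK releases, so treat the printed values as something to observe rather than a guarantee.

```java
import java.util.Spliterator;
import java.util.stream.IntStream;

public class LimitCharacteristics {
    // Hypothetical helper: true if the pipeline's spliterator reports SIZED.
    static boolean reportsSized(IntStream stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        // The bare range is SIZED.
        System.out.println("range(0, 100) SIZED?           "
                + reportsSized(IntStream.range(0, 100)));

        // The two pipelines from the text. The SIZED result here is
        // implementation-dependent, so no expected value is shown.
        System.out.println("range(0, 100).limit(10) SIZED? "
                + reportsSized(IntStream.range(0, 100).limit(10)));
        System.out.println("range(0, 1).limit(10) SIZED?   "
                + reportsSized(IntStream.range(0, 1).limit(10)));

        // The element counts, by contrast, are well defined.
        System.out.println(IntStream.range(0, 100).limit(10).count()); // 10
        System.out.println(IntStream.range(0, 1).limit(10).count());   // 1
    }
}
```

The two limit(10) pipelines produce different numbers of elements (10 and 1), which is exactly the difference the limit stage cannot express through a characteristic alone.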
So that's why, even though there are times when it seems like it ought to, the limit operator doesn't return a stream of known size. This in turn affects the splitting strategy, which impacts parallel efficiency.