There is no definite statement possible, as you are developing against an interface and the specification does not mandate a particular sort algorithm.
For the generic Stream
, we have to assume a comparison sort whose worst case of O(n log n) can not be avoided by any algorithm.
But your example uses an IntStream
which, in principle, could use a counting sort or similar, having O(n). This doesn’t happen in practice with the reference implementation, as having a better worst case time complexity does not necessarily lead to a better performance in practice and the maximum number of elements is limited to the JVM’s maximum array size.
The time complexity of distinct()
is O(n), as it just checks whether adding into a HashSet
succeeds. Combining the O(n) with O(n log n) leads to an overall complexity of O(n log n). Maybe the interviewer got this combining of time complexities wrong.
But this is a good example demonstrating that time complexity is not the equal to performance. When you use sorted().distinct()
, the distinct()
operation will utilize the sorted nature of the incoming elements, which makes it unnecessary to build a HashSet
behind the scenes¹. Since the reference implementation has no primitive value set, this eliminates a lot of boxing overhead. On the other hand, using distinct().sorted()
could reduce the number of elements to sort, but it requires to have significantly less distinct elements than total stream elements to pay off.
Such performance differences are not covered by the time complexity, which still is the same for both approaches. But as said, for the streams of primitive types, different algorithms with a different time complexity would be possible.
But one thing, we can say for sure. When you split the operations into two stream operations, like requesting a result array from the first and passing it to Arrays.stream
again, there is no chance that the underlying implementation utilizes knowledge about the previous operation in the next operation.
Note that the statements above assume a terminal operation like your example’s toArray
that consumes all elements and requires the resulting encounter order to be maintained. With other, short-circuiting or unordered terminal operations, the overall time complexity could change, e.g. sorted().findFirst()
could get optimized to the equivalent of min()
or the sorting step could get eliminated for an unordered terminal operation, like sum()
. That does not happen in the current reference implementation, but, as said, you’re programming against an interface.
¹ for primitive streams, this only works in Java 9+. As said, there is no primitive specialization for distinct()
, it’s implemented like boxed().distinct().mapToInt(i -> i)
and in Java 8, boxed()
is implemented as mapToObj(Integer::valueOf)
which loses the information about the sorted input, as explained in the last section of this answer.