4

Ordering

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

  1. Are there any other types that does not have the encounter order property but HashSet?
  2. In case I'm not intersted in keeping the exsisting order or any sort of sorting, is it considered to be best practice by explicitly calling to unordered intermidiate operation on each stream that will be computed in parallel?
Stav Alfi
  • 13,139
  • 23
  • 99
  • 171

3 Answers3

6

Besides the HashSet and HashMap’s collection views, Stream.generate() will generate an unordered stream.

Needless to say, streams generated by a Random are unordered too. Also, Stream.empty() does not report to have an encounter order, but that has not much consequences…

If you know that you don’t need the Stream to maintain the encounter order, it’s a good practice to use unordered()—even if it doesn’t improve the performance, as with most operations in the current implementation, it doesn’t harm and will document that you don’t care for the order. That does not only apply to parallel streams, some operations, like distinct(), may benefit from unorderedness even in the sequential case.

In some cases, choosing the right terminal operation, like findAny() instead of findFirst() documents that intent more concise and will also have a higher impact on the performance, given the current implementation.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • Hi thanks agian for answering me. I got 2 qeustions: 1. Why `findAny()` will be slower in any case then `findFirst()`? 2. It seems you have strong control on the little details to each topic I asked. Do you try to understand the source code of stream liberary or reading oracle documentation is enough? Thanks agian!! – Stav Alfi Jun 02 '17 at 18:54
  • 2
    `findAny()` is not slower, but *faster* than `findFirst()`, not in any case, but some, simply because there are less constraints on the implementation. This is documented, so reading the documentation is enough to learn that it might be the case, but it requires a bit of experience to know *when* and some knowledge about what’s going on behind the scenes (not necessarily in terms of reading the source code) to understand *why* this is the case. Answering questions on Stackoverflow about that topic for some years, helps too, as trying to solve these puzzles will raise the knowledge… – Holger Jun 06 '17 at 08:14
5

Encounter order is nothing but the order of the source. E.g. In ArrayList, the elements are sorted in insertion order so streaming over it will give you elements in that order only.

  1. Apart from HashSet, HashMap is also unordered.
  2. If you are only interested in collect operation and don't care about ordering then you don't need to worry about it. Just stream() will be fine. E.g. if you want to let's say calculate the sum then you would do something like this:

    List<Integer> list = Arrays.asList(1,2,3);
    int sum = list.stream().collect(Collectors.summingInt(e -> e));
    

    In this case, it doesn't matter in which order the elements are flowing into the stream.

Yahya
  • 13,349
  • 6
  • 30
  • 42
Darshan Mehta
  • 30,102
  • 11
  • 68
  • 102
2

For your second question yes. If you don't care about those, unordered will help. I sort had the same question a while ago, here.

Now think about un-ordered. When you have a List and you multiply each element by two and collect that back to a List and you do that in parallel. Each merge of the intermediate List has to happen in such a way that the order is preserved in the result List. You might have computed the 4-th and the first intermediate result and now need to merge them. If you care about order, you can't merge them directly, since this will obviously break the order; so you need to compute the other intermediate results and merge them in the same order.

You might imagine this like traversing a List from left to right; from index zero to the last one.

On the other hand if you don't care about order, this merging can happened in any order whenever it's ready.You can even read any element in any order, since this is irrelevant.

findFirst and findAny work with the same idea in the background. Suppose you have a List with 8 elements, process it in parallel and need to return the first one only. You might have already processed 7 last elements already, but because you need the first, it does not matter - you still have to wait for the first to process. It gets obvious why findAny is better...

Eugene
  • 117,005
  • 15
  • 201
  • 306