10

Suppose we have a standard method chain of stream operations:

Arrays.asList("a", "bc", "def").stream()
  .filter(e -> e.length() != 2)
  .map(e -> e.length())
  .forEach(e -> System.out.println(e));

Are there any guarantees in the JLS regarding the order in which stream operations are applied to the list elements?

For example, is it guaranteed that:

  1. Applying the filter predicate to "bc" is not going to happen before applying the filter predicate to "a"?
  2. Applying the mapping function to "def" is not going to happen before applying the mapping function to "a"?
  3. 1 will be printed before 3?

Note: I am talking here specifically about stream(), not parallelStream() where it is expected that operations like mapping and filtering are done in parallel.

Dragan Bozanovic
  • 23,102
  • 5
  • 43
  • 110
  • 2
    I think everything you want to know can be found [here](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#Ordering) – Yassin Hajaj Mar 19 '16 at 14:38
  • Are you are asking this question because you are curious, or because you would like to build something that relies on such ordering? – Sergey Kalinichenko Mar 19 '16 at 15:01
  • @dasblinkenlight I am trying to build something that relies on the ordering. More specifically, I want to build a custom predicate/function which will provide me with both index and the element of the list, so that I can use both as the filtering/mapping condition. I intend to increment the index whenever `accept`/`apply` is called. Something similar to [this question](http://stackoverflow.com/questions/18552005/is-there-a-concise-way-to-iterate-over-a-stream-with-indices-in-java-8). – Dragan Bozanovic Mar 19 '16 at 15:10
  • @DraganBozanovic So you are building something like [`zipWithIndex`](http://stackoverflow.com/a/23051268/335858), right? Then you do not care about `filter()` or `map()` much, only about `stream()`, because once the stream is filtered or mapped, the indexing is relative to filtered collection, not to the original one. – Sergey Kalinichenko Mar 19 '16 at 15:18
  • @dasblinkenlight True, and I wanted to be sure about the ordering guarantees. Of course, when it comes to filtering, then indexing is relative to filtered collection (it makes sense to use it only in the filter predicate), but if there is no filtering, then I can use it in all of the operations if the order is guaranteed to be maintained. – Dragan Bozanovic Mar 19 '16 at 15:22

5 Answers5

9

Everything you want to know can be found within the java.util.stream JavaDoc.

Ordering

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.

For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.

Community
  • 1
  • 1
Yassin Hajaj
  • 21,337
  • 9
  • 51
  • 89
  • Thanks. I edited my question a bit. _"Further, some terminal operations may ignore encounter order, such as forEach()."_ Does it mean that this guarantee dos not apply to `forEach`? – Dragan Bozanovic Mar 19 '16 at 14:56
  • 2
    @DraganBozanovic There is `forEach` and `forEachOrdered`: the latter will keep encounter order. Refer to this question also http://stackoverflow.com/questions/29216588/how-to-ensure-order-of-processing-in-java8-streams – Tunaki Mar 19 '16 at 14:59
4

Are there any guarantees in the JLS regarding the order in which stream operations are applied to the list elements?

The Streams library is not covered by the JLS. You would need to read the Javadoc for the library.

Streams also support parallel stream and the order in which things are processed depends on the implementations.

Applying the filter predicate to "bc" is not going to happen before applying the filter predicate to "a"?

It would be reasonable to assume that it would, but you can't guarantee it, nor should you be writing code which requires this guarantee otherwise you wouldn't be able to parallelise it later.

applying the mapping function to "def" is not going to happen before applying the mapping function to "a"?

It is safe assume this does happen, but you shouldn't write code which requires it.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
1

There is no guarantee of the order in which list items are passed to predicate lambdas. Stream documentation makes guarantees regarding the output of streams, including the order of encounter; it does not make guarantees about implementation details, such as the order in which filter predicates are applied.

Therefore, the documentation does not prevent filter from, say, reading several elements, running the predicate on them in reverse order, and then sending the elements passing the predicate to the output of the stream in the order in which they came in. I don't know why filter() would do something like that, but doing so wouldn't break any guarantee made in the documentation.

You can make pretty strong inference from the documentation that filter() would call predicate on the elements in the order in which collection supplies them, because you are passing the result of calling stream() on a list, which calls Collection.stream(), and, according to Java documentation, guarantees that Stream<T> produced in this way is sequential:

Returns a sequential Stream with this collection as its source.

Further, filter() is stateless:

Stateless operations, such as filter and map, retain no state from previously seen element when processing a new element - each element can be processed independently of operations on other elements.

Therefore it is rather likely that filter would call the predicate on elements in the order they are supplied by the collection.

I am talking here specifically about stream(), not parallelStream()

Note that Stream<T> may be unordered without being parallel. For example, calling unordered() on a stream(), the result becomes unordered, but not parallel.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
0

If the stream is created from a list, it is guaranteed that the collected result will be ordered the same way the original list was, as the documentation states:

Ordering

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.

To go further, there is no guarantee regarding the order of the execution of the map execution though.

From the same documention page (in the Side Effects paragraph):

Side-effects

If the behavioral parameters do have side-effects, unless explicitly stated, there are no guarantees as to the visibility of those side-effects to other threads, nor are there any guarantees that different operations on the "same" element within the same stream pipeline are executed in the same thread. Further, the ordering of those effects may be surprising. Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.

In practice, for an ordered sequential stream, chances are that the stream operations will be executed in order, but there is no guarantee.

assylias
  • 321,522
  • 82
  • 660
  • 783
Jean Logeart
  • 52,687
  • 11
  • 83
  • 118
0

Are there any guarantees in the JLS regarding the order in which stream operations are applied to the list elements?

Quoting from Ordering section in Stream javadocs

  • Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations.


Applying the filter predicate to "bc" is not going to happen before applying the filter predicate to "a"?

As quoted above, streams may or may not have a defined order. But in your example since it is a List, the same Ordering section in Stream javadocs goes on saying that

  • If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6].

    Applying the above statement to your example - I believe, the filter predicate would receive the elements in order defined in the List.


Or, applying the mapping function to "def" is not going to happen before applying the mapping function to "a"?

For this I would refer to the Stream operations section Stream operations in Streams, that says,

  • Stateless operations, such as filter and map, retain no state from previously seen element when processing a new element

    Since map() doesn't retain state, I believe, it is safe to assume "def" is not going to be processed before "a" in your example.


1 will be printed before 3?

Although it may be unlikely with sequential streams (like List) but not guaranteed as the Ordering section in Stream javadocs does indicate that

  • some terminal operations may ignore encounter order, such as forEach().