12

Is there any guarantee that operations on a sequential and ordered Stream are processed in encounter order?

I mean, if I have code like this:

IntStream.range(0, 5)
        .map(i -> {
            myFunction(i);
            return i * 2;
         })
         .boxed()
         .collect(toList());

is there a guarantee, that it will perform myFunction() calls in the encounter order of the generated range?

I found draft JavaDocs for the Stream class which explicitely states this:

For sequential stream pipelines, all operations are performed in the encounter order of the pipeline source, if the pipeline source has a defined encounter order.

but in the official JavaDocs this line was removed. It now discusses encounter order only for selected methods. The Package java.util.stream doc in the Side-effects paragraph states:

Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.

but it says nothing about sequential streams and the example is for a parallel stream (My understanding is that it's true for both sequential and parallel streams, but this is the part I'm not sure about).

On the other hand, it also states in the Ordering section:

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.

but this time it start with "operating on the elements", but the example is about the resulting stream, so I'm not sure they are taking side-effects in account and side-effects is really what this question is about.

piotrek
  • 282
  • 1
  • 3
  • 14
  • 2
    Does this answer your question? http://stackoverflow.com/questions/36102961/what-is-the-order-in-which-stream-operations-are-applied-to-list-elements – Tunaki May 17 '16 at 13:54
  • I've expanded the question to include the ordering section, which sounds to me like it's contradicting the side-effect section. – piotrek May 17 '16 at 14:47
  • Maybe this one http://stackoverflow.com/questions/29216588/how-to-ensure-order-of-processing-in-java8-streams – Jean-Baptiste Yunès May 17 '16 at 14:52

1 Answers1

2

I think we can learn a lot from the fact that this explicit sentence has been removed. This question seems to be closely related to the question “Does Stream.forEach respect the encounter order of sequential streams?”. The answer from Brian Goetz basically says that despite the fact that there’s no scenario where the order is ignored by the Stream’s current implementation when forEach is invoked on a sequential Stream, forEach has the freedom to ignore the encounter order even for sequential Streams per specification.

Now consider the following section of Stream’s class documentation:

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.

Since it is the terminal operation which determines whether elements are needed and whether they are needed in the encounter order, a terminal action’s freedom to ignore the encounter order also implies consuming, hence processing, the elements in an arbitrary order.

Note that not only forEach can do that. A Collector has the ability to report an UNORDERED characteristic, e.g. Collectors.toSet() does not depend on the encounter order. It’s obvious that also an operation like count() doesn’t depend on the order—in Java 9 it may even return without any element processing. Think of IntStream#sum() for another example.

In the past, the implementation was too eager in propagating an unordered characteristic up the stream, see “Is this a bug in Files.lines(), or am I misunderstanding something about parallel streams?” where the terminal operation affected the outcome of a skip step, which is the reason why the current implementation is reluctant about such optimizations to avoid similar bugs, but that doesn’t preclude the reappearance of such optimizations, then being implemented with more care…

So at the moment it’s hard to imagine how an implementation could ever gain a performance benefit from exploiting the freedom of unordered evaluations in a sequential Stream, but, as stated in the forEach-related question, that doesn’t imply any guarantees.

Community
  • 1
  • 1
Holger
  • 285,553
  • 42
  • 434
  • 765
  • 1
    @Hulk: Whether summing `int`s will overflow or not does *not* depend on the order of elements. The result of `a+b+c` is the same as `c+b+a` or `b+c+a`, etc. for all `int` values, even when an overflow occurs. That’s why I chose `IntStream#sum()` rather than `DoubleStream#sum()`, where the order indeed does make a difference (though `DoubleStream#sum()`’s documentation explicitly states that the order of additions will be undefined). – Holger May 18 '16 at 08:45
  • I came to the same conclusion after thinking about it for a bit - that's why I already deleted my comments. Sorry for any confusion caused ;) – Hulk May 18 '16 at 08:48
  • *So at the moment it’s hard to imagine how an implementation could ever gain a performance benefit from exploiting the freedom of unordered evaluations in a sequential Stream* - `.sorted().findAny()` could eliminate the sorting. If unorderedness could be propagated back to the source spliterator (it can't) then some data structure traversal could also be optimized . – the8472 Jan 13 '18 at 18:32
  • @the8472 well, eliminating sorting was the only optimization in that regard and it had been removed. But I never said there weren’t possibilities. – Holger Jan 15 '18 at 07:19