
As far as I'm aware, methods such as findFirst, skip and limit keep their behaviour in parallel streams as long as the stream is ordered (which it is by default), whether it's parallel or not. So I was wondering why the forEach method is different. I gave it some thought, but I just could not understand the necessity of defining a forEachOrdered method, when it could have been easier and less surprising to make forEach ordered by default and then call unordered on the stream instance, with no need to define a new method.

Unfortunately, my practical experience with Java 8 is quite limited at this point, so I would really appreciate it if someone could explain to me the reasons for this architectural decision, maybe with some simple examples/use-cases to show me what could go wrong otherwise.

Just to make it clear, I'm not asking about this: forEach vs forEachOrdered in Java 8 Stream. I'm perfectly aware of how those methods work and the differences between them. What I'm asking about is the practical reasons for the architectural decision made by Oracle.

Holger
  • *as doing so would sacrifice the benefit of parallelism.*... – Naman Sep 14 '18 at 05:09
  • And how is "skip" and "limit" having "ordered by default" behaviour not sacrificing the benefit of parallelism? Some may say they just followed POLA with those methods, but frankly speaking, having an exception (which is always bad) just for "forEach" is subjectively more astonishing. – Gaponenko Andrei Sep 14 '18 at 05:29
  • What is the reason for you to expect `forEach` to apply the action in any particular order? It’s not implied by the term “for each …” – Holger Sep 14 '18 at 06:46
  • I changed the title, removed “by default” and “in parallel streams”, as `forEach` is unordered *in general*; it’s not a default that could somehow be changed, and even if sequential streams may not exploit this property today, it’s defined as an unordered operation. – Holger Sep 14 '18 at 07:23
  • "practical reasons" - easier to implement if you have fewer constraints to abide by... and "decision made by Oracle" then probably (hopefully) Oracle knows better – user85421 Sep 14 '18 at 08:05

3 Answers


When you process elements of a Stream in parallel, you simply should not expect any guarantees on order.

The whole idea is that multiple threads work on different elements of that stream. They progress individually, so the order of processing is not predictable; it is nondeterministic.

I could imagine that the people implementing that interface purposely give you an arbitrary order, to make it really clear that you should not expect any particular order when using parallel streams.

GhostCat
  • For a terminal operation that preserves the order (and no intermediate operations that would break it), you would always get an ordered output. What you are talking about is the processing of *intermediate* operations, and `forEach` is not one. I've also read your answer a few times and I can't see it answering the OP's question – Eugene Sep 14 '18 at 08:15
  • For example, *When you process elements of a Stream in parallel, you simply should not expect any guarantees on order*, but `List.of(1,2,3,4).stream().parallel().collect(Collectors.toList())` will preserve the same order on output, even if you are processing elements in a parallel stream. Or *The whole idea is that multiple threads work on different elements of that stream* - this is about intermediate stages, not terminal ones, and the OP is asking about `forEach` (a terminal one). I don't know... might be me here, but this is misleading at best – Eugene Sep 14 '18 at 08:21
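
The point made in the comments above can be checked with a short sketch. It assumes a small list source and a doubling `map` step that are not part of the original discussion; the class and method names (`ParallelCollectDemo`, `doubled`) are purely illustrative. The elements may be processed by different threads, yet `Collectors.toList()` assembles the result in encounter order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelCollectDemo {

    // Doubles every element on a parallel stream and collects into a List.
    // The map step may run on several threads, but toList() respects
    // the encounter order of the (ordered) list source.
    static List<Integer> doubled(List<Integer> source) {
        return source.stream()
                     .parallel()
                     .map(i -> i * 2)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(doubled(Arrays.asList(1, 2, 3, 4))); // prints [2, 4, 6, 8]
    }
}
```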

Methods such as findFirst, limit and skip require the order of the input, so their behaviour doesn't change whether we use a parallel or a serial stream. However, forEach as a method does not need any order, and thus its behaviour is different.

For parallel stream pipelines, forEach operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism.

I would also suggest not using findFirst, limit and skip with parallel streams, as they can reduce performance because of the overhead required to preserve encounter order in a parallel pipeline.
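
As a rough illustration of the difference described above, here is a minimal sketch. The class and method names (`ForEachOrderDemo`, `viaForEach`, `viaForEachOrdered`) and the 1..1000 input are assumptions made for the example, not anything from the original posts. It records the order in which each terminal operation hands elements to the action.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ForEachOrderDemo {

    // Records elements in the order forEachOrdered delivers them.
    // On an ordered source this matches the encounter order,
    // even though the stream is parallel.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> seen = new CopyOnWriteArrayList<>();
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    // Records elements in the order forEach delivers them.
    // Every element arrives exactly once, but the order is unspecified
    // and may differ from run to run.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> seen = new CopyOnWriteArrayList<>(); // thread-safe sink
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.rangeClosed(1, 1000)
                                        .boxed()
                                        .collect(Collectors.toList());

        System.out.println(viaForEachOrdered(source).equals(source)); // prints true
        // No guarantee either way for forEach; it is often false on a multi-core machine.
        System.out.println(viaForEach(source).equals(source));
    }
}
```

The thread-safe list is needed only because `forEach` may invoke the action from several threads at once.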

Rishabh Agarwal

Defining a forEach method that would preserve order, and unordered to break it, would complicate things IMO, simply because unordered does nothing more than set a flag in the Stream API internals, and the flag check would have to be performed or enforced based on some conditions.

So let's say you would do:

someStream()
      .unordered()
      .forEach(System.out::println);

In this case, your proposal is to not print elements in any order, thus enforcing unordered here. But what if we did:

someSet().stream()
         .unordered()
         .forEach(System.out::println);

In this case would you want unordered to be enforced? After all, the source of a stream is a Set, which has no order, so in this case, enforcing unordered is just useless; but this means additional tests on the source of the stream internally. This can get quite tricky and complicated (as it already is btw).
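
The "flag" mentioned above can be observed from the outside, at least roughly, through the spliterator characteristics. The sketch below is only an illustration: the helper `isOrdered` and the class name `OrderedFlagDemo` are made up for the example, and inspecting `Spliterator.ORDERED` is just one way to look at the flag, not a description of how the stream internals consult it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

public class OrderedFlagDemo {

    // Hypothetical helper: reports whether the stream's spliterator
    // carries the ORDERED characteristic. Note that calling spliterator()
    // is a terminal operation, so the stream cannot be reused afterwards.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Set<Integer> set = new HashSet<>(list);

        System.out.println(isOrdered(list.stream()));             // list source: ordered
        System.out.println(isOrdered(list.stream().unordered())); // flag cleared by unordered()
        System.out.println(isOrdered(set.stream()));              // HashSet source: never ordered
    }
}
```

For the `HashSet` source, calling `unordered()` changes nothing, which is the redundancy the answer points out.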

To make it simpler, two methods were defined that clearly state what they do; this is on a par with, for example, findFirst vs findAny, or even Optional::isPresent and Optional::isEmpty (the latter added in java-11).

Eugene