I have 1000 products that need to be processed in parallel, and the result is also a collection. I need to maintain the order, so I used parallelStream() with forEachOrdered, as shown below:

products.parallelStream().forEachOrdered(pdt -> 
 finalProd.add(mapProduct(pdt)));

I have tested the code, and it does not seem efficient, although it is better than using stream(). As per the Oracle documentation:

the method forEachOrdered, which processes the elements of the stream in the order specified by its source, regardless of whether you executed the stream in serial or parallel. Note that you may lose the benefits of parallelism if you use operations like forEachOrdered with parallel streams.

So we are not using the full potential of parallelism if we use forEachOrdered. Can someone suggest a better way to handle this parallel task? I only care about the order of the final output, not the order of execution.

Vishnu T S
  • We can't reproduce your issue with the information you've provided. – nicomp Apr 27 '23 at 02:49
  • @nicomp We don't have to reproduce this; the Oracle documentation itself says forEachOrdered is not efficient with parallelStream – Vishnu T S Apr 27 '23 at 02:55
  • When you say 'I have tested the code but it seems more efficient', what does that mean? Did you write 2 test cases and run them with [JMH](https://www.baeldung.com/java-microbenchmark-harness)? If the answer is no - then it is likely you have no basis to make this claim. – rzwitserloot Apr 27 '23 at 02:57
  • @rzwitserloot I have not used a benchmark, only multiple test cases and StopWatch – Vishnu T S Apr 27 '23 at 03:06
  • @VishnuTS Is it OK to assume that `finalProd` is supposed to have the mapped products in the same order as `products`? – Andrés Alcarraz Apr 27 '23 at 03:31
  • @AndrésAlcarraz Yes, the mapped products should be in the same order as products – Vishnu T S Apr 27 '23 at 03:35
  • OK @VishnuTS, I added an answer with some explanation. There are comments with unsupported claims that it won't respect the order, but I found nothing to support them. So if you test it and it works, please accept it. – Andrés Alcarraz Apr 27 '23 at 03:41
  • Does this answer your question? [Why parallel stream get collected sequentially in Java 8](https://stackoverflow.com/questions/29709140/why-parallel-stream-get-collected-sequentially-in-java-8) – Hulk Apr 27 '23 at 04:00
  • As @AndrésAlcarraz pointed out, most collect and reduce operations do maintain encounter order unless source or target are unordered. This comes at a cost, however - performance may be somewhere between forEach and forEachOrdered. – Hulk Apr 27 '23 at 04:05

1 Answer

This assumes that finalProd should contain the mapped products, in the same order as products:

finalProd = products
     .parallelStream()
     .map(p -> mapProduct(p))
     .collect(Collectors.toList());

The parallelStream() method creates a parallel stream whose elements can be processed concurrently. The map operation preserves the encounter order of the stream, and the collector then gathers the results into a list that also respects that order.
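As a minimal, self-contained sketch of the above: the class name, the Integer product ids, and the body of mapProduct are hypothetical stand-ins (the question does not show the real types), and the code sticks to Java 8 APIs. It maps 1000 elements in parallel and compares the result with a sequential run.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderedParallelMapDemo {

    // Hypothetical stand-in for the question's mapProduct(...).
    static String mapProduct(int productId) {
        return "mapped-" + productId;
    }

    // The mapping may run on several threads; Collectors.toList()
    // assembles the results in the encounter order of the source list.
    static List<String> mapAll(List<Integer> products) {
        return products.parallelStream()
                .map(OrderedParallelMapDemo::mapProduct)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> products = IntStream.range(0, 1000)
                .boxed()
                .collect(Collectors.toList());

        List<String> finalProd = mapAll(products);

        // Sequential reference result to compare against.
        List<String> expected = products.stream()
                .map(OrderedParallelMapDemo::mapProduct)
                .collect(Collectors.toList());

        System.out.println(finalProd.equals(expected)); // prints "true"
    }
}
```

This only checks the order of the output. Whether it is actually faster than forEachOrdered for a given mapProduct is a separate question that needs a proper benchmark.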

You can check this claim in the stream package javadoc: Ordering

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]...

...

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.

Emphasis is mine. Since we are only applying operations that respect the encounter order, the order of the result is guaranteed.
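If a side-effecting terminal operation is still preferred over a collector, one possible variant is to move the mapping into map() and keep only the add inside forEachOrdered. In the question's code, mapProduct runs inside the forEachOrdered action, which is performed one element at a time, so the mapping work itself is likely serialized. The sketch below again uses a hypothetical class name and a hypothetical mapProduct:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class MapThenForEachOrderedDemo {

    // Hypothetical stand-in for the question's mapProduct(...).
    static String mapProduct(int productId) {
        return "mapped-" + productId;
    }

    static List<String> mapAll(List<Integer> products) {
        List<String> finalProd = new ArrayList<>(products.size());
        products.parallelStream()
                // The mapping step may be executed by several threads.
                .map(MapThenForEachOrderedDemo::mapProduct)
                // Only the cheap add is performed in encounter order,
                // one element at a time.
                .forEachOrdered(finalProd::add);
        return finalProd;
    }

    public static void main(String[] args) {
        List<Integer> products = IntStream.range(0, 1000)
                .boxed()
                .collect(Collectors.toList());

        List<String> finalProd = mapAll(products);

        System.out.println(finalProd.size());  // prints "1000"
        System.out.println(finalProd.get(0));  // prints "mapped-0"
    }
}
```

The collector version is usually the more idiomatic choice; this variant is only meant to show where the expensive work sits relative to the ordered step.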

Andrés Alcarraz
  • This would not guarantee that the data is processed in order, let alone turns into a list that holds the same order, and you're making wild stabs in the dark about what `finalProd` even is, here. If you don't have enough information, don't just toss a random answer out. – rzwitserloot Apr 27 '23 at 02:58
  • @rzwitserloot Can you say why, linking to a source, or demonstrate with an example why it does not respect the order? The assumption is quite reasonable given the OP's question. – Andrés Alcarraz Apr 27 '23 at 03:13
  • In other words, the burden of proof is on you, not on me. Without specced proof that this works, the correct act is to presume it won't work - even if running it on a JVM right now appears to indicate that it does. – rzwitserloot Apr 27 '23 at 03:46
  • @rzwitserloot Ordered streams guarantee ordered processing regardless of whether they are processed in parallel or sequentially. Whether the code of the mapping function has race conditions is outside the scope of the question and the answer. Just read the javadoc to see a proof of what I claim https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#Ordering. – Andrés Alcarraz Apr 27 '23 at 03:48
  • @rzwitserloot the OP has clarified that only the order in the result is important. Encounter order will be preserved by the toList-Collector. See https://stackoverflow.com/questions/29709140/why-parallel-stream-get-collected-sequentially-in-java-8 – Hulk Apr 27 '23 at 03:56
  • @AndresAlcarraz wrote: _The ordered streams guarantee_ - uh, okay. That's nice. This answer does not use an ordered stream, your comment isn't relevant to this answer. – rzwitserloot Apr 27 '23 at 13:52
  • @rzwitserloot The `products` list is ordered, otherwise it wouldn't make sense to respect that order. It's implicit to the question. And an ordered collection produces an ordered stream. – Andrés Alcarraz Apr 27 '23 at 14:06