-3

I have a list of products, and I want to populate DTOs from it into another list while keeping the original order. I used a parallel stream, and the resulting list did not preserve the original order:

    List<Product> products = productList.getProducts();
    
    List<ProductModelDTOV2> productModelDTOV2s = new ArrayList<>();
    
    products.parallelStream().forEach(p -> {
        try {
            ProductModelDTOV2 ProductModelDTOV2 = dtoFactoryV2.populate(p, summary);
            productModelDTOV2s.add(ProductModelDTOV2);
        } catch (GenericException e) {
            log.debug(String.format("Unable to populate Product %s", p));
        }
    });
    return productModelDTOV2s;
yali
  • You can't have the cake and eat it too... A parallel stream is unordered. And if you want them ordered, the stream is by definition, sequential, which implies "not parallel". – Sweeper Oct 25 '20 at 08:01
  • @Sweeper Yes, but I have a huge amount of data to populate, so I need a parallel stream, not a sequential stream. – yali Oct 25 '20 at 08:03

2 Answers

5

It seems like this part of the code can be unordered and be run in parallel:

    ProductModelDTOV2 ProductModelDTOV2 = dtoFactoryV2.populate(p, summary);

But this part must be ordered:

    productModelDTOV2s.add(ProductModelDTOV2);

What you can do is to separate those two things. Do the first part in a flatMap, and the second part in forEachOrdered:

    products.parallelStream().flatMap(p -> { // this block will be done in parallel
        try {
            return Stream.of(dtoFactoryV2.populate(p, summary));
        } catch (GenericException e) {
            // don't expect this message to be printed in order
            log.debug(String.format("Unable to populate Product %s", p));
            return Stream.of();
        }
    })
    .forEachOrdered(productModelDTOV2s::add); // this will be done in order, non-parallel
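
The split between a parallel mapping step and an ordered terminal step can be tried out in isolation. The following is a minimal, self-contained sketch, not the OP's actual code: the class `OrderedParallelForEach`, the `populate` method (a hypothetical stand-in for `dtoFactoryV2.populate` that fails for multiples of 7) and the `String` results are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class OrderedParallelForEach {

    // Hypothetical stand-in for dtoFactoryV2.populate(p, summary):
    // throws for multiples of 7 to simulate a product that cannot be populated.
    static String populate(int id) {
        if (id % 7 == 0) {
            throw new IllegalStateException("cannot populate " + id);
        }
        return "dto-" + id;
    }

    static List<String> convert(List<Integer> ids) {
        List<String> result = new ArrayList<>();
        ids.parallelStream()
                .flatMap(id -> {
                    // This part may run on several threads, in any order.
                    try {
                        return Stream.of(populate(id));
                    } catch (IllegalStateException e) {
                        // Failed elements are dropped from the stream.
                        return Stream.<String>empty();
                    }
                })
                // Elements are handed to the list one at a time, in encounter order.
                .forEachOrdered(result::add);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(convert(List.of(1, 2, 3, 7, 8)));
    }
}
```

Under these assumptions, the failing element is skipped and the remaining results appear in the same relative order as the input list.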
Sweeper
  • And if you want to get a true benefit from parallel processing, use `.collect(Collectors.toList())` instead of `forEachOrdered`. – Holger Oct 26 '20 at 14:37
  • @Holger Is that because `forEachOrdered` has to wait for the next element in encounter order to finish processing, which may cost more? Or is it something else? – Eklavya Oct 26 '20 at 14:39
  • @Holger Does `toList` add the elements in order, as that’s a requirement of the OP? If so, how is it still parallel? – Sweeper Oct 26 '20 at 14:40
  • That’s [how collectors work](https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/stream/package-summary.html#MutableReduction). Generally, don’t confuse [processing order and encounter order](https://stackoverflow.com/a/29218074/2711488). Maybe [this answer](https://stackoverflow.com/a/41045442/2711488) is helpful: it explains the difference between concurrent and nonconcurrent collection by discussing `toMap` vs `toConcurrentMap`, but the way a nonconcurrent collector works applies to `toList` as well. – Holger Oct 26 '20 at 15:01
4

The correct way to do this would be to have the Stream create the list:

    List<Product> products = productList.getProducts();

    return products.parallelStream()
            .map(p -> {
                try {
                    return dtoFactoryV2.populate(p, summary);
                } catch (GenericException e) {
                    log.debug("Unable to populate Product " + p);
                    return null;
                }
            })
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
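
To check that `collect(Collectors.toList())` keeps the encounter order of an ordered source even when the stream is parallel, here is a small runnable sketch. The class `OrderedParallelCollect`, the `populate` method (a hypothetical stand-in for `dtoFactoryV2.populate` that fails for multiples of 7) and the `String` results are assumptions for illustration, not part of the original code.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class OrderedParallelCollect {

    // Hypothetical stand-in for dtoFactoryV2.populate(p, summary):
    // throws for multiples of 7 to simulate a product that cannot be populated.
    static String populate(int id) {
        if (id % 7 == 0) {
            throw new IllegalStateException("cannot populate " + id);
        }
        return "dto-" + id;
    }

    static List<String> convert(List<Integer> ids) {
        return ids.parallelStream()
                .map(id -> {
                    // Mapping may run concurrently and in any processing order.
                    try {
                        return populate(id);
                    } catch (IllegalStateException e) {
                        return null; // marked for removal by the filter below
                    }
                })
                .filter(Objects::nonNull)
                // The collector merges partial lists in encounter order,
                // so no shared mutable list is touched by multiple threads.
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(convert(List.of(1, 2, 3, 7, 8)));
    }
}
```

Because no element has to wait for its predecessor before being added to a partial result, this variant tends to parallelize better than `forEachOrdered`, as the comments on the other answer point out.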
Andreas