
Consider the following example:

    IntStream.of(-1, 1)
             .parallel()
             .flatMap(i->IntStream.range(0,1000).parallel())
             .forEach(System.out::println);

Does it matter whether I set the inner flag to parallel? The results look very similar whether or not I include it.

Also, why does the code (ReferencePipeline) sequentialize the mapping?

I am confused by the line:

    result.sequential().forEach(downstream);
Gurwinder Singh
Benedikt Bünz
    Well, based on the comment in the code `We can do better that this too; optimize for depth=0 case and just grab spliterator and forEach it`, I'm assuming that they didn't have to implement it as `result.sequential().forEach(downstream)` and could have used parallel implementation for better performance. – Eran Jun 25 '14 at 14:49

2 Answers


In the current JDK (jdk1.8.0_25), the answer is no: it doesn't matter whether you set the inner flag to parallel, because even if you do, the .flatMap() implementation sets the stream back to sequential here:

    result.sequential().forEach(downstream);

("result" is the inner stream, and the documentation of its sequential() method says: Returns an equivalent stream that is sequential. May return itself, either because the stream was already sequential, or because the underlying stream state was modified to be sequential.)

In most cases there is little to gain from making the inner stream parallel anyway, provided the outer stream has at least as many elements as there are threads that can run in parallel (ForkJoinPool.commonPool().getParallelism() = 3 on my computer).
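One way to see this behaviour for yourself is to record which threads touch the inner elements. The sketch below is only an illustration: the class name `InnerParallelDemo` and the helper `innerThreadNames()` are made up for this example, and the outer stream is deliberately left sequential so that any extra thread could only come from the inner `.parallel()` call.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

public class InnerParallelDemo {

    // Hypothetical helper: returns the names of all threads that
    // processed an element of the inner streams.
    static Set<String> innerThreadNames() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.of(-1, 1)                       // outer stream is sequential on purpose
                 .flatMap(i -> IntStream.range(0, 1000)
                                        .parallel() // flatMap switches this back to sequential
                                        .peek(x -> names.add(Thread.currentThread().getName())))
                 .forEach(x -> { });              // consume everything, discard the values
        return names;
    }

    public static void main(String[] args) {
        // If the inner stream really ran in parallel, several
        // ForkJoinPool worker names would typically show up here.
        System.out.println(innerThreadNames());
    }
}
```

With the flatMap implementation quoted above, the set should contain only the calling thread, whether or not the inner `.parallel()` call is present.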

Gurwinder Singh
Daniel Hári
    I slightly disagree with the last statement. It is only true if the computational load per outer element is roughly equal. – Benedikt Bünz Dec 06 '16 at 10:34
  • Assume you are making a web scraper, the outer stream being over websites and the inner over their words or something. If one of the websites is much larger than the rest, this might mean you end up with a basically sequential program. Is that understood correctly? – Thomas Ahle Jan 30 '20 at 00:11

This is for anyone who, like me, has a dire need to parallelize flatMap and wants a practical solution, not only history and theory.

The simplest solution I came up with is to do flattening by hand, basically by replacing it with map + reduce(Stream::concat).

I already posted an answer with details in another thread: https://stackoverflow.com/a/66386078/3606820
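As a rough sketch of the map + reduce(Stream::concat) idea applied to the example from the question (boxed to `Stream<Integer>` because `Stream::concat` works on object streams; the class name `ManualFlatten` and the method `flatten()` are invented for this illustration and are not taken from the linked answer):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class ManualFlatten {

    // Hypothetical helper: flattens by hand instead of calling flatMap.
    static List<Integer> flatten() {
        return Stream.of(-1, 1)
                .parallel()
                // map each outer element to its (still unconsumed) inner stream
                .map(i -> IntStream.range(0, 1000).boxed().parallel())
                // glue the inner streams together; the concatenated stream
                // is parallel if either of its inputs is parallel
                .reduce(Stream::concat)
                // an empty outer stream yields an empty result
                .orElseGet(Stream::empty)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(flatten().size());
    }
}
```

One caveat: the Javadoc of `Stream.concat` warns that repeated concatenation can lead to deep call chains or even a `StackOverflowError`, so this approach is probably best suited to a modest number of inner streams.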

Dmytro Buryak