I'm trying to understand why the following Java program gives an OutOfMemoryError, while the corresponding program without .parallel() doesn't.
System.out.println(Stream
.iterate(1, i -> i+1)
.parallel()
.flatMap(n -> Stream.iterate(n, i -> i+n))
.mapToInt(Integer::intValue)
.limit(100_000_000)
.sum()
);
I have two questions:
1. What is the intended output of this program?

Without .parallel() it seems that this simply outputs sum(1+2+3+...), which means that it simply "gets stuck" at the first stream in the flatMap, which makes sense. With .parallel() I don't know if there is an expected behaviour, but my guess would be that it somehow interleaves the first n or so streams, where n is the number of parallel workers. It could also be slightly different based on the chunking/buffering behaviour.

2. What causes it to run out of memory?

I'm specifically trying to understand how these streams are implemented under the hood. I'm guessing something blocks the stream, so it never finishes and is never able to get rid of the generated values, but I don't quite know in which order things are evaluated and where buffering occurs.
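One way to poke at the buffering question is to count how many elements the source actually produces versus how many survive the limit. This is my own diagnostic sketch, not part of the original program (the 1_000 bound and the counter are my choices); since the stream is ordered, limit() must keep the first 1_000 elements in encounter order, but the source is free to run ahead of the cut-off under parallelism:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

public class GeneratedVsConsumed {
    public static void main(String[] args) {
        // Counts every element the infinite source emits, before limit() trims.
        AtomicLong generated = new AtomicLong();

        int sum = IntStream.iterate(1, i -> i + 1)
                .peek(i -> generated.incrementAndGet()) // runs for each generated element
                .limit(1_000)                           // keeps the first 1_000 in encounter order
                .parallel()
                .sum();

        System.out.println("sum = " + sum); // 500500 = 1000 * 1001 / 2
        // At least 1_000; under parallelism the source may have produced more
        // elements than limit() ultimately kept.
        System.out.println("generated = " + generated.get());
    }
}
```

Whatever the exact buffering strategy, the gap between `generated` and the limit gives a rough picture of how far the pipeline runs ahead of the elements it actually needs.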
Edit: In case it is relevant, I'm using Java 11.
Edit 2: Apparently the same thing happens even for the simple program IntStream.iterate(1, i -> i + 1).limit(1000_000_000).parallel().sum(), so it might have to do with the laziness of limit rather than with flatMap.
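For what it's worth, the pipeline from Edit 2 only misbehaves when the bound is huge; the same shape with a small bound completes normally (the 1_000 here is my choice for a quick check, not from the original), which is consistent with the memory use growing with the value passed to limit() rather than staying constant:

```java
import java.util.stream.IntStream;

public class SmallLimitParallel {
    public static void main(String[] args) {
        // Same pipeline shape as in Edit 2, but with a bound small enough
        // that whatever limit() buffers for this unsized, ordered parallel
        // stream stays trivial in size.
        int sum = IntStream.iterate(1, i -> i + 1)
                .limit(1_000)
                .parallel()
                .sum();

        System.out.println(sum); // 500500 = 1000 * 1001 / 2
    }
}
```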