I've read a book on Java 8, and it says that using a parallel stream over a range of numbers obtained via `IntStream.range(0, someNumber)` can be slower than the sequential one. Why is that?

- http://stackoverflow.com/q/26838242/6309 and http://stackoverflow.com/a/21969069/6309 should help here. – VonC May 02 '15 at 05:59
- Why do you assume that using more threads should be faster? – Peter Lawrey May 02 '15 at 08:46
- As far as I remember, it was *Java 8 in Action*. I assumed more threads should be faster, since each thread would produce partial data and only combining the partial chunks would cause some overhead... – user1409534 May 02 '15 at 09:12
- Perhaps you could be more specific. There is an entire chapter on parallel processing in *Java 8 In Action*, and there are several places where `IntStream.range` and similar methods are used. – Stuart Marks May 02 '15 at 09:33
- Don't believe everything you read. – Brian Goetz May 02 '15 at 14:02
4 Answers
Whatever you took away from this book is simply wrong (or, at best, a gross oversimplification.) Since you don't say what book, we don't know whether the book is wrong or you just misunderstood.
Whether you get a parallel speedup is a function of many things: the splittability of the source, the operations on your stream, the work done by your behavioral parameters, and your hardware. Having an unsplittable source can definitely kill parallel performance; for example, a `LinkedList` is unlikely to parallelize well.
This talk goes into greater detail on the factors that determine whether parallelism will speed up your computation, slow it down, or neither, and on how to recognize the likely parallel behavior of a stream pipeline.
Where the book (or your interpretation) goes wrong is putting the blame on `IntStream.range`; it is one of the best-splitting sources. So, if you've got a pipeline that's not parallelizing well, it is definitely not because you used `IntStream.range` as a source, but it could be for any number of other reasons (too little data, high merge costs in the terminal operation, etc.)
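To make "splittability" concrete, here is a minimal sketch that splits each source once via its `Spliterator` and reports how many elements land on each side. The class and method names (`SplitDemo`, `splitOnce`) are made up for illustration, and the exact sizes are a JDK implementation detail rather than a guarantee.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.IntStream;

public class SplitDemo {

    // Splits the given spliterator once and returns
    // { estimated size of the split-off part, estimated size left behind }.
    static long[] splitOnce(Spliterator<?> remaining) {
        Spliterator<?> prefix = remaining.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, remaining.estimateSize() };
    }

    public static void main(String[] args) {
        int n = 100_000;

        // A range knows its bounds, so it can be cut without visiting elements.
        long[] range = splitOnce(IntStream.range(0, n).spliterator());

        // A linked list has to walk its nodes to hand out a chunk.
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            linked.add(i);
        }
        long[] list = splitOnce(linked.spliterator());

        System.out.println("range:      " + range[0] + " / " + range[1]);
        System.out.println("LinkedList: " + list[0] + " / " + list[1]);
    }
}
```

On the JDKs I am aware of, the range splits into two roughly equal halves, while the `LinkedList` only peels off a comparatively small batch after traversing it, which is why it gives the framework much less to work with.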
There's no such thing as magic parallelism dust; streams can make it easy for you to write parallel code, but they don't absolve you of the need to understand the parallel cost model. But if someone is telling you that `IntStream.range` is the problem, I suggest you stop listening to them -- this is dangerously wrong advice.

- https://stackoverflow.com/questions/26838242/why-does-intstream-range0-100000-parallel-foreach-take-longer-then-normal-for – Zombies Dec 19 '19 at 14:01
- @Zombies The "benchmark" in that example is worse than worthless. Not only is the number meaningless (the worthless part), it got you to conclude something that is not true (this is the "worse" part), because it is so hard for humans to see a measurement number and not imbue it with meaning. That benchmark doesn't measure what you think, or what the author thought. – Brian Goetz Dec 20 '19 at 16:16
It can be slower. You should always use sequential streams by default. A parallel stream has much higher overhead than a sequential one, because it needs a lot of internal coordination and bookkeeping.
You should consider parallel streams only if:
1. You have a huge number of items to process, each item takes substantial time, and the work can be parallelized.
2. You have a performance problem in the first place.
So the golden rule is: always benchmark before trying parallel streams or any other concurrency construct.
In your case, if the range is very small, the overhead associated with parallel streams can outweigh the benefit you are supposed to get. Also check this article: http://zeroturnaround.com/rebellabs/java-parallel-streams-are-bad-for-your-health/
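As a small illustration of "sequential by default", the sketch below writes the same pipeline both ways. The class and method names are hypothetical. The only difference is the `.parallel()` call, and both versions return the same value, so switching is cheap once a proper benchmark says it pays off.

```java
import java.util.stream.LongStream;

public class SequentialFirst {

    // Sequential version: the default, and usually the right starting point.
    static long sumOfSquaresSequential(long n) {
        return LongStream.range(0, n).map(i -> i * i).sum();
    }

    // Same pipeline, switched to parallel mode with a single call.
    static long sumOfSquaresParallel(long n) {
        return LongStream.range(0, n).parallel().map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        long sequential = sumOfSquaresSequential(n);
        long parallel = sumOfSquaresParallel(n);
        System.out.println("same result: " + (sequential == parallel));
    }
}
```

This deliberately contains no timing code: naive `System.nanoTime()` loops are easy to misread, so any real comparison is better done with a benchmark harness.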

It depends a lot on the scenario. There is overhead in handing work to the worker threads and coordinating them. So if the work you wish to run does not take much time, parallelism is just a waste of CPU cycles.
On the other hand, I was writing a data generation method. I used parallel streams for it, and that improved performance at least 4 times. In my case each parallel task was responsible for multiple reads and writes against databases, so the processing time per task was pretty high.
P.S.: You can set the system property `java.util.concurrent.ForkJoinPool.common.parallelism`. This is useful if your process depends on the number of parallel threads. By default, the common pool that `.parallel()` uses has a parallelism of one less than the number of available processors, and the calling thread also takes part in the work, so you end up with roughly one thread per core.
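To see what that property does on a given machine, a small sketch like the following prints the values involved. `CommonPoolInfo` is a hypothetical class name; the two calls it wraps are standard JDK methods.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {

    // Number of processors the JVM reports as available.
    static int availableProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool used by parallel streams.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + availableProcessors());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```

Running it as `java -Djava.util.concurrent.ForkJoinPool.common.parallelism=4 CommonPoolInfo` should report a parallelism of 4. The property is read when the common pool is initialized, so passing it on the command line is the dependable way to set it.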

Before using `parallelStream()`, read this:
- It is multi-threaded. Just writing `parallelStream()` to get parallelism is almost always a bad idea in Java. There are some cases where it will work, but not always. There are other ways to achieve parallelism, and you almost always need to think hard before choosing a multi-threaded solution.
- It uses the JVM-wide common `ForkJoinPool`. So if you do any blocking operation in it, such as a network call, you tie up threads that every other parallel stream in the application shares, and the whole application can stall. That's the biggest problem. There are others around task allocation as well. For blocking work, a simple `ExecutorService` with `n` threads often performs better than parallel streams.
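A minimal sketch of that `ExecutorService` alternative is shown below. The `fetch` method is a made-up stand-in for a blocking call (it just sleeps), and the class name and pool size are assumptions for illustration. The point is that the blocking work runs on its own fixed pool instead of the shared common pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingTasks {

    // Hypothetical stand-in for a blocking operation such as a network call.
    static String fetch(int id) throws InterruptedException {
        Thread.sleep(50);
        return "result-" + id;
    }

    // Runs `count` blocking tasks on a dedicated pool of `threads` threads
    // and returns the results in submission order.
    static List<String> fetchAll(int count, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                tasks.add(() -> fetch(id));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(8, 4));
    }
}
```

Because the pool is private to this code, slow calls here cannot starve parallel streams elsewhere in the application, and the thread count can be sized for I/O rather than for the number of cores.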
You can also read: Java Parallel Streams Are Bad for Your Health! | JRebel by Perforce
