I know where you are hinting at with your question, and I will do my best to explain.
Consider an input list consisting of 8 elements:
[1, 2, 3, 4, 5, 6, 7, 8]
And assume streams would parallellize it in the following way, in reality they do not, the exact process of parallellization is quite difficult to understand.
But for now, assume that they would divide the size by two until a two elements remain.
The branching division would look like this:
First division:
[1, 2, 3, 4]
[5, 6, 7, 8]
Second division:
[1, 2]
[3, 4]
[5, 6]
[7, 8]
Now we have four chunks that will (in our theory) be processed by four different threads, which have no knowledge of eachother.
This can indeed be fixed by synchronizing on some external resource, but then you lose the benefits of parallellization, so we need to assume that we do not synchronize, and when we do not synchronize, the other threads will not see what any other threads have done, so our result will be garbage.
Now onto the part of the question where you ask about statelessness, how could it then be processed in parallel correctly? How can you add elements that are processed in parallel in the correct order to a list?
First assume a simple mapping function where you map with the lambda i -> i + 10
, and then print it with System.out::println
in a foreach.
Now after the second division the following will occur:
[1, 2] -> [11, 12] -> { System.out.println(11); System.println(12); }
[3, 4] -> [13, 14] -> { System.out.println(13); System.println(14); }
[5, 6] -> [15, 16] -> { System.out.println(15); System.println(16); }
[7, 8] -> [17, 18] -> { System.out.println(17); System.println(18); }
There is no guarantee on the order apart from that all elements processed by the same thread (internal state, not to rely upon) get processed in order.
If you want to process them in order, then you need to use forEachOrdered
, which will ensure that all threads operate in the correct order, and you do not lose too much of a parallellization benefit because of this as it applies only to the end state.
To see how you can add items parelllized to an list, take a look at this, by using the Collectors.toList()
, which provides methods for:
- Creating a new list.
- Adding a value to the list.
- Merging two lists.
Now the following will happen after the second division:
For every four threads it will do the following (only showing one thread here):
- We had
[1, 2]
.
- We map it to
[11, 12]
.
- We create an empty
List<Integer>
.
- We add
11
to the list.
- We add
12
to the list.
Now all threads have done this, and we have four lists of two elements.
Now the following merges occur in the specified order:
[11, 12] ++ [13, 14] = [11, 12, 13, 14]
[15, 16] ++ [17, 18] = [15, 16, 17, 18]
- Finally
[11, 12, 13, 14] ++ [15, 16, 17, 18] = [11, 12, 13, 14, 15, 16, 17, 18]
And thus the resulting list is in order and the mapping has been done in parallel. Now you should also be able to see why parallallization needs a higher minimum as just two items, as else the creation of the new lists and merging get too expensive.
I hope you understand now why stream operations should be stateless to get the full benefits of parallellization.