Consistency of memory after Java parallel stream worker threads have exited

Question

Given the following code:

```java
import java.util.stream.IntStream;

final int n = 50;
final int[] addOne = new int[n];
IntStream.range(0, n)
        .parallel()
        .forEach(i -> addOne[i] = i + 1);
// (*) Are the addOne[i] values all visible here?
for (int value : addOne) {
    System.out.println(value);
}
```

The question: After the worker threads have exited (i.e. at point (*)), can it be guaranteed that the main thread will see all array contents written by the worker threads?

I am interested in understanding what the Java memory model says about the above question. This has nothing to do with the concurrency issues per se (i.e. the fact that parallel streams in Java can process their elements in any order). To preempt some replies, I know that it is impossible to guarantee memory ordering semantics between two different threads with access to the same array element in Java without using something like AtomicReferenceArray<E>. For the purpose of this question, assume that Atomic* classes will not be used by parallel workers. More importantly, note that no two worker threads ever try to write to the same array element, since all the i values are unique. Therefore memory ordering semantics between threads are not important here, only whether any value written to an array element by a worker thread will always be visible to the main thread after the parallel stream has ended.

There is a computational "barrier" between initializing the array elements in the main thread and launching the parallel worker threads (the workers will always initially see elements with their zero initializer value). And there is a completion barrier that waits for all workers to complete at the end of the stream before handing control back to the main thread. So really the question reduces to whether a total ordering or implicit "memory flush barrier" can be assumed when a computational barrier is imposed at the end of a parallel stream.

Asked another way, is there any chance at all that the main thread might read the default initialization value of 0 for some element after point (*)? Or will the CPU cache hierarchy always ensure that the main thread will see the most recent value written to the array by a worker thread, even if that value hasn't been flushed out of the CPU cache back to RAM yet?

I am assuming for the purposes of this question that it takes zero time to return control to the main thread after the parallel stream has completed, so there is no race condition that happens to cause the array values to be flushed to RAM due to the time it takes to shut down the parallel stream, or due to the amount of cache eviction that has to take place to shut down the parallel stream.
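For comparison, here is a minimal sketch of the same start barrier and completion barrier built from plain threads, where the Java Language Specification does spell out the ordering: `Thread.start()` happens-before every action in the started thread, and every action in a thread happens-before another thread's successful return from `join()` on it. The class name `JoinVisibilitySketch`, the helper `fillWithThreads` and the stride partitioning are assumptions made for illustration. This does not show how parallel streams are implemented, and by itself it does not answer whether the stream's completion barrier gives the same guarantee.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinVisibilitySketch {

    // Hypothetical helper: fills an array from several plain threads,
    // each writing a disjoint set of indices, then joins all of them.
    static int[] fillWithThreads(int n, int workers) {
        final int[] addOne = new int[n]; // zero-initialized before any worker starts
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int start = w;
            Thread t = new Thread(() -> {
                // Stride partitioning: each index is written by exactly one worker.
                for (int i = start; i < n; i += workers) {
                    addOne[i] = i + 1;
                }
            });
            threads.add(t);
            t.start(); // start() happens-before every action in the started thread
        }
        for (Thread t : threads) {
            try {
                t.join(); // every action in t happens-before join() returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining workers", e);
            }
        }
        return addOne; // all worker writes are visible to the caller here
    }

    public static void main(String[] args) {
        for (int value : fillWithThreads(50, 4)) {
            System.out.println(value);
        }
    }
}
```

With explicit `start()` and `join()` calls, the main thread cannot observe a leftover 0 after the joins. The open point in the question is whether the terminal operation of a parallel stream provides an equivalent edge.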

Does this answer your question? [Why does Collection.parallelStream() exist when .stream().parallel() does the same thing?](https://stackoverflow.com/questions/24603186/why-does-collection-parallelstream-exist-when-stream-parallel-does-the-sa) — vicpermir, Feb 21 '20 at 12:31
By the way, if `N` is not a constant, it should be written lowercase... — dan1st, Feb 23 '20 at 09:12
Since encounter order does _not_ play a role in this, the question boils down to whether the consumer given in the `forEach` call has been executed for all elements of the `Stream`. I'm sure this is the case (although I won't be looking up an authoritative answer for it). — daniu, Feb 24 '20 at 14:09
I can't close this as a duplicate, but [here you go](https://stackoverflow.com/questions/53906027/does-collection-parallelstream-imply-a-happens-before-relationship) — Eugene, Apr 13 '20 at 04:04
Thanks, yes, I verified this on the jdk-dev mailing list, and this comment pretty much sums up the answers I got: https://stackoverflow.com/questions/53906027/does-collection-parallelstream-imply-a-happens-before-relationship#comment94677200_53906027 — Luke Hutchison, Apr 14 '20 at 07:04

Aleksandr Semyannikov · Answer 1 · 2020-02-21T14:04:48.747

0

The Java Memory Model (JLS §17.4.1) says:

> All instance fields, static fields, and array elements are stored in heap memory. In this chapter, we use the term variable to refer to both fields and array elements.

That means you need to make sure there is a happens-before relationship between writing and reading the array elements.

The Javadoc of java.util.stream.IntStream#forEach says:

> For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.

That means you have to establish the happens-before relationship between writing and reading the array elements yourself. The contract gives no guarantee that the main thread will see all array contents written by the worker threads.

PS: Streams is a complex framework, and I am actually not sure whether your particular situation is really unsafe. But the contract says there is no guarantee if you access shared state (your array is shared between the caller thread and the workers), and it's better to follow the contract.
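One way to stay inside that contract is to avoid shared state altogether and let the stream build the array. The following is a minimal sketch under that assumption; the class name `NoSharedStateSketch` and the helper `addOne` are made up for illustration and are not part of the question's code.

```java
import java.util.stream.IntStream;

public class NoSharedStateSketch {

    // Hypothetical helper: the lambda only maps a value and never touches
    // state shared with the calling thread; the stream assembles the result.
    static int[] addOne(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i + 1)
                .toArray(); // result keeps encounter order, so result[i] == i + 1
    }

    public static void main(String[] args) {
        for (int value : addOne(50)) {
            System.out.println(value);
        }
    }
}
```

Because the array is the return value of the terminal operation rather than a side effect of the lambda, the question of visibility of side effects does not come up in this variant.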

edited Feb 21 '20 at 14:04

answered Feb 21 '20 at 13:19


The second section you quoted says the elements may be processed in any order, which has nothing to do with the Java memory model, only that the implementation of parallel streams reserves the right to work through the elements in any order it wants. When the stream is finished, all elements will have been processed, but this leaves the memory consistency question unresolved. You said: "you should enforce a happens-before relationship between writing and reading array elements". There is already a completion barrier at the end of the stream. Does this not yield a happens-before relationship? – Luke Hutchison Feb 22 '20 at 02:35
@LukeHutchison, you missed the part "If the action accesses shared state, it is responsible for providing the required synchronization". Your array's elements are shared state. The part about processing order is not important for your question. – Aleksandr Semyannikov Feb 22 '20 at 09:50
I understand what you're saying, but shared state only affects situations where you have anything other than one writer _or_ any number of readers at a given time. As soon as you mix readers and writers, and/or have multiple writers for a single piece of memory, you have to provide synchronization for the shared state. This is a standard and generic principle of concurrency, and has nothing to do with Java's memory model per se. What I want to know is whether the CPU caches can be assumed to be consistent at the end of the stream, so that the main thread sees the latest cache values. – Luke Hutchison Feb 23 '20 at 02:44
@LukeHutchison, there are two writers of each element: first, each element is set to 0 in the main thread during array initialization, then it is changed by one of the worker threads. – Aleksandr Semyannikov Feb 23 '20 at 07:17
You can have any number of writers. But you cannot have two or more _concurrent_ writers. There is a strict total ordering ("happens-after") between all the initialization writes happening, then all the worker threads launching. There is also a strict total ordering between the worker threads writing and the reads after the stream is completed. There is no ordering whatsoever between the different worker writes. But _for any one specific array element_, there is a total ordering between when the value is initialized, then overwritten, then later read. There is no confusion about total order. – Luke Hutchison Feb 23 '20 at 08:39

diginoise · Answer 2 · 2020-02-25T10:18:12.030

-1

Reviewed answer:

The Fork-Join pool, which is where the pipeline executes after parallel(), has fork(), invoke() and join() steps. The last step in that sequence, join(), is semantically equivalent to Thread.join(), which means that there is a happens-before relationship between the parallel task executed by the Fork-Join pool and the statement after it.
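To make the fork/join pattern concrete, here is a minimal sketch that fills the same array with an explicit `RecursiveAction`. The class `ForkJoinSketch`, the task `FillTask` and the split threshold are assumptions made for illustration. This is not the stream library's internal code; it only shows the `fork()`, `compute()` and `join()` sequence. The `ForkJoinTask` documentation describes a task's effects as consistently observable by other threads once `join()` or a related method has returned.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinSketch {

    // Hypothetical task: writes target[i] = i + 1 for lo <= i < hi,
    // splitting the range in half until it is small enough.
    static final class FillTask extends RecursiveAction {
        private static final int THRESHOLD = 8; // arbitrary split size for the sketch
        private final int[] target;
        private final int lo;
        private final int hi;

        FillTask(int[] target, int lo, int hi) {
            this.target = target;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                for (int i = lo; i < hi; i++) {
                    target[i] = i + 1;
                }
                return;
            }
            int mid = (lo + hi) >>> 1;
            FillTask left = new FillTask(target, lo, mid);
            FillTask right = new FillTask(target, mid, hi);
            left.fork();     // schedule the left half asynchronously
            right.compute(); // process the right half in the current thread
            left.join();     // wait for the left half to complete
        }
    }

    static int[] fill(int n) {
        int[] addOne = new int[n];
        // invoke() returns only after the whole task tree has completed.
        ForkJoinPool.commonPool().invoke(new FillTask(addOne, 0, n));
        return addOne;
    }

    public static void main(String[] args) {
        for (int value : fill(50)) {
            System.out.println(value);
        }
    }
}
```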

edited Feb 25 '20 at 10:18

answered Feb 24 '20 at 13:57


There's no need for a custom thread pool. A parallel stream never returns control back to the calling thread until all worker threads have become quiescent after completing processing of all stream elements. Yes, `Future` can be used to produce an absolute ordering between a writer and readers, but that's not the question I'm asking here. – Luke Hutchison Feb 25 '20 at 00:10
@LukeHutchison I get what you mean now - write visibility in presence of potential operations reordering and what guarantees it. – diginoise Feb 25 '20 at 09:51
