I have a class containing a volatile reference to an array:

private volatile Object[] objects = new Object[100];

Now, I can guarantee that only one thread (call it the writer) can write to the array. For example:

objects[10] = new Object();

All other threads will only read values written by the writer thread.

Question: Do I need to synchronize such reads and writes in order to ensure memory consistency?

I presume that yes, I should, because it would not be useful from a performance standpoint if the JVM provided some kind of memory consistency guarantees when writing to an array. But I'm not sure about that; I didn't find anything helpful in the documentation.

Boann
St.Antario
  • check out this similar question http://stackoverflow.com/questions/27120914/do-i-need-to-add-some-locks-or-synchronization-if-there-is-only-one-thread-writi – Praveen Kumar Apr 05 '16 at 10:12
  • If you're in deep enough to have multithreading concerns, chances are you should be using proper collections, not bare arrays. – Pharap Apr 05 '16 at 12:27
  • IMO, the best answer here is the one by @Boann. They're all correct, but Boann's answer is simple, _and_ correct, and it directly answers your question. – Solomon Slow Apr 05 '16 at 15:23
  • @jameslarge Except that it was partly wrong. – Boann Apr 05 '16 at 18:12
  • Refer to http://stackoverflow.com/questions/3519664/difference-between-volatile-and-synchronized-in-java for more details. – Ravindra babu Apr 05 '16 at 18:37

4 Answers

14

You may use AtomicReferenceArray:

final AtomicReferenceArray<Object> objects = new AtomicReferenceArray<>(100);

// writer
objects.set(10, new Object());

// reader
Object obj = objects.get(10);

This will ensure atomic updates and happens-before consistency of read/write operations, the same as if each item of array was volatile.
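
To make the writer/reader split concrete, here is a minimal sketch of one writer thread handing a value to a reader through an `AtomicReferenceArray`. The class name `AtomicArrayDemo` and the method `writeThenRead` are hypothetical names invented for this illustration; only `AtomicReferenceArray.set` and `get` come from the answer above.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical demo class; the names are illustrative, not part of any library.
class AtomicArrayDemo {

    // Starts a writer thread, then polls in the calling (reader) thread
    // until the written element becomes visible.
    static String writeThenRead(int index, String value) {
        AtomicReferenceArray<String> slots = new AtomicReferenceArray<>(100);

        // Writer: set() has the memory effects of a volatile write.
        Thread writer = new Thread(() -> slots.set(index, value));
        writer.start();

        // Reader: get() has the memory effects of a volatile read,
        // so this loop is guaranteed to observe the write eventually.
        String seen;
        while ((seen = slots.get(index)) == null) {
            Thread.yield();
        }
        return seen;
    }
}
```

Polling is used here only to keep the sketch short; a real reader would usually tolerate seeing `null` (the "not written yet" state) instead of spinning.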

Alex Salauyou
  • Or synchronization, or semaphores, or anything else that will satisfy the JVM Memory Model. – user207421 Apr 05 '16 at 10:29
  • @EJP sure, but the answer including all possible options will be too large. I tried to fit OP's model (now they use an array and need atomic updates of its items, if I understand the question correctly). – Alex Salauyou Apr 05 '16 at 10:40
  • That such a Collection exists implicitly indicates that the answer to the OP's question is "no". – Raedwald Apr 07 '16 at 07:03
  • @Raedwald You have a point! Besides, there is another powerful feature of `AtomicXxx` classes -- support of [CAS operations](https://en.wikipedia.org/wiki/Compare-and-swap), allowing to implement various lock-free algorithms. – Alex Salauyou Apr 07 '16 at 08:00
13
private volatile Object[] objects = new Object[100];

This makes only the objects reference volatile, not the contents of the array instance it refers to.

Question: Do I need to synchronize such reads and writes in order to ensure memory consistency?

Yes.

it would not be useful from performance standpoint if JVM provides some kind of memory consistency guarantees when writing to an array

Consider using a collection such as CopyOnWriteArrayList (or your own array wrapper that takes a Lock inside both the mutator and the read methods).

The Java platform also has Vector (obsolete, with a flawed design) and the synchronized List wrappers (slow in many scenarios), but I do not recommend using them.
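
The "own array wrapper with a Lock" idea mentioned above could look roughly like the following. This is a sketch under assumptions: the class name `LockedArray` and its `get`/`set` methods are hypothetical, and `ReentrantReadWriteLock` is just one possible `Lock` implementation, chosen because the question has one writer and many readers.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper; class and method names are illustrative only.
class LockedArray<T> {
    private final Object[] items;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    LockedArray(int size) {
        items = new Object[size];
    }

    // Mutator: takes the write lock, so the store is published on unlock.
    void set(int index, T value) {
        lock.writeLock().lock();
        try {
            items[index] = value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader: takes the read lock, so it sees everything written
    // before the most recent write-lock release.
    @SuppressWarnings("unchecked")
    T get(int index) {
        lock.readLock().lock();
        try {
            return (T) items[index];
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Because both paths go through the same lock, the backing array field itself does not need to be volatile.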

PS: One more good idea from @SashaSalauyou

Community
Cootri
  • this link provides more details http://stackoverflow.com/questions/32096419/synconisized-list-map-in-java-if-only-one-thread-is-writing-to-it – Praveen Kumar Apr 05 '16 at 10:19
  • One question. Why did you mention `Vector` as that it's considered a legacy class? I guess just as an example... – St.Antario Apr 05 '16 at 15:20
  • Vector is just a well-known example, yes. Nowadays it is considered as obsolete – Cootri Apr 05 '16 at 15:23
  • `Vector` is a terrible example, as is `Collections.synchronizedList`. One is obsolete and one is a horribly naive implementation. Take a look at the [`java.util.concurrent`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/package-summary.html) package. This answer is all wrong, really. – Boris the Spider Apr 05 '16 at 17:49
  • what is "all wrong" with two first parts of the answer and the proposal to use COWL or own array wrapper with some Lock implementation in mutators and read methods? – Cootri Apr 05 '16 at 18:38
  • It's not the first part of the answer I have a problem with, it's the recommendation to reinvent the wheel. You picked two straw men, from early Java versions as your examples of concurrent collections in java, then said you didn't recommend them; fine. You also mention the very specialised `CopyOnWriteArrayList`. This is not modern concurrent Java. If you need speed, use a [wait free](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html) collection. – Boris the Spider Apr 05 '16 at 22:46
  • problem is that we don't know exact author's problem, that's why is used "like" word and proposed to use *some* collection. `ConcurrentLinkedQueue` isn't an obvious replacement for an `array` (but a very good choice for a set of problems) – Cootri Apr 06 '16 at 07:22
8

Per JLS § 17.4.5 – Happens-before Order:

Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.

[...]

A write to a volatile field happens-before every subsequent read of that field.

The happens-before relation is quite strong. It means that if thread A writes to a volatile variable, and any thread B later reads the variable, then thread B is guaranteed to see the change to the volatile variable itself, as well as every other change thread A made before setting the volatile variable, including to any other objects whether or not they were otherwise volatile.
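
As an illustration of that guarantee (not of the array case yet), here is a minimal sketch in which a plain field is made visible by a later write to a volatile flag. The class `VolatileFlagDemo` and its members are hypothetical names made up for this example.

```java
// Hypothetical demo class; names are illustrative only.
class VolatileFlagDemo {
    private int payload;             // plain, non-volatile field
    private volatile boolean ready;  // volatile guard

    // Runs a writer thread and reads the payload in the calling thread.
    int publishAndRead(int value) {
        Thread writer = new Thread(() -> {
            payload = value; // plain write, ordered before the volatile write
            ready = true;    // volatile write: publishes everything above
        });
        writer.start();

        // Volatile read: once it returns true, the write to payload
        // happens-before this point and is therefore visible.
        while (!ready) {
            Thread.yield();
        }
        return payload;
    }
}
```

The key point is that the reader actually observes the new value of the volatile variable (`ready` flips from `false` to `true`), which is what lets it know the write has already happened.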

However, this is not enough!

The element assignment objects[10] = new Object(); is not a write of the variable objects. It's only a read of the variable to determine the array which it points at, followed by a write to a different variable that is contained within the array object located somewhere else in memory. No happens-before relation is established by mere reads to volatile variables, so that code is not safe.

As @DimitarDimitrov points out, you can kludge around this by doing a dummy write to the objects variable. Each pair of operations – the objects = objects; reassignment by the writer thread coupled with a foo = objects[x]; lookup by a reader thread – defines an updated happens-before relation, and thus will "publish" all of the latest changes made by the writer thread to the reader thread. That can work, but it requires discipline, and it's not elegant.

But there is a more subtle problem with that: Even if the reader thread sees the updated value of the array element that still doesn't guarantee that it sees the fields of the object referred to by that element correctly, because the following order is possible:

  1. Writer creates some object foo.
  2. Writer sets objects[x] = foo;
  3. Reader checks objects[x] and sees the reference to the new object foo (which it can do, although it is not guaranteed to do so since there is no happens-before relationship yet).
  4. Writer does objects = objects;

Unfortunately, this doesn't define the formal happens-before relationship, because the volatile variable read (3) came before the volatile variable write (4). Although the reader can see that objects[x] is the object foo by chance, this doesn't mean that the fields of foo are safely published, so the reader may theoretically see the new object, but with the wrong values! To solve that, the objects you're sharing between threads using this technique would need to have all fields final or volatile or otherwise synchronized. If the objects are all Strings for example, you'll be fine, but otherwise, it is too easy to make mistakes with this. (Thank you @Holger for pointing this out.)
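
For reference, an element type that satisfies the "all fields final" condition could be sketched like this. `ImmutableElem` is a hypothetical class invented for illustration; the guarantee relies on the final-field semantics of JLS § 17.5 and assumes `this` does not escape during construction.

```java
// Hypothetical element type; the name and fields are illustrative only.
final class ImmutableElem {
    // Final fields: a thread that obtains a reference to a fully
    // constructed instance sees these values as set in the constructor,
    // even if the reference itself was obtained through a data race.
    final int id;
    final String label;

    ImmutableElem(int id, String label) {
        this.id = id;
        this.label = label;
        // Do not leak 'this' from here (e.g. by registering a listener),
        // otherwise the final-field guarantee no longer applies.
    }
}
```

Note that this only protects the element's own fields. Whether the reader sees the new array slot at all still depends on a happens-before relationship.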


Here are some less flaky alternatives:

  • The concurrent array classes like AtomicReferenceArray exist to provide arrays in which every element behaves as if volatile. This is much easier to use correctly, because it ensures that if a reader sees the updated array element value, it also correctly sees the object referred to by that element.

  • You can wrap all accesses to the array in synchronized blocks, synchronizing on some shared object:

    // writer
    synchronized (aSharedObject) {
        objects[x] = foo;
    }
    
    // reader
    synchronized (aSharedObject) {
        bar = objects[x];
    }
    

    Like volatile, using synchronized creates a happens-before relationship. (Everything a thread does before releasing the synchronization lock of an object happens-before any other thread acquires the synchronization lock of the same object.) If you do this, your array does not need to be volatile.

  • Consider if an array is really what you need here. You haven't said what these writer and reader threads are for, but if you want some kind of producer-consumer queue, then the class you really need is a BlockingQueue or an Executor. You should look around the Java concurrency classes to see if one of them already does what you need, because if one does, it will certainly be easier to use correctly than volatile.
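
If the use case does turn out to be a producer-consumer hand-off, the last alternative could be sketched as follows. This assumes a bounded `ArrayBlockingQueue` with an arbitrary capacity of 4; the class `QueueHandoffDemo` and the method `transfer` are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; names and the queue capacity are illustrative.
class QueueHandoffDemo {

    // One producer thread puts 0..count-1; the calling thread consumes them.
    static List<Integer> transfer(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take()); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

The queue handles both the blocking and the memory visibility, so neither `volatile` nor `synchronized` appears in the calling code.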

Boann
  • A dummy write like `objects = objects;` is *not* sufficient as it can’t guarantee that the reading thread will read the `objects` array reference *after* that write. Since the array reference is the same before and after the write, the reading thread can read the array reference before that write, but after the write of the reference into the array, producing a data race if the object has mutable state. – Holger Apr 05 '16 at 16:51
  • @Holger Oh dear! You're right! I've edited the answer. – Boann Apr 05 '16 at 18:12
  • I saw this kind of `objects = objects` several times. IMO it is quite ugly, because it disallows to make a container `final`. Cannot understand why this idiom is so popular... – Alex Salauyou Apr 06 '16 at 07:46
  • @Sasha Salauyou: it’s one of the many misbeliefs around `volatile` variables. It seems a lot of developers are trying to be clever and most of the time they’re not even checking whether these “clever tricks” gain any performance benefits, not to speak of correctness… As a rule of thumb, if a write has no observable effect (like when writing an old value again), it can’t establish a happens-before relationship with a reader. And if it can’t establish such a relationship in general, a JVM is free to elide it altogether. – Holger Apr 06 '16 at 08:22
  • @Holger *As a rule of thumb, if a write has no observable effect (like when writing an old value again, it can’t establish a happens-before relationship with a reader* - this is just not true :) Even if a `volatile` write uses the current value of the field, the compiler cannot remove the memory coherence side effects, and there will be a *happens-before* edge between that write and subsequent reads in the synchronization order. Please take a look in the comments section here: http://jeremymanson.blogspot.com/2009/06/volatile-arrays-in-java.html where one of the JMM authors explains that. – Dimitar Dimitrov Apr 07 '16 at 10:48
  • @Dimitar Dimitrov: maybe you should read the article you have linked: “*In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable.*”. It’s irrelevant that the dummy write “does provide the volatile write” as it requires the pairing of a volatile write with a subsequent volatile read to establish a *happens-before* relationship. – Holger Apr 07 '16 at 10:59
  • @Dimitar Dimitrov: …and within the comments part the author of that blog says *exactly* what I’m saying: “*If the reader sees the updated value, it has no idea whether the volatile write might have happened (i.e., it can read the value 1 after arr[0] = 1, but before arr = arr). This isn't a big problem with reading a scalar, but it would be a big problem if you were relying on the happens-before relationship for something.*” – Holger Apr 07 '16 at 11:00
  • @Holger You are talking more about the direct usefulness of such a *happens-before* edge, not about its existence. Maybe it will be better to use an example I've sketched. `arr[0] = new ArrayElem(elemData); arr = arr;` Here `ArrayElem` is taking care to safely publish its contents, because all of its fields are `final`. What this guarantees is that you can't see badly initialized array element, but in addition you are guaranteed that you'll see this update no later than the next subsequent read with which the dummy write establishes *happens-before*. – Dimitar Dimitrov Apr 07 '16 at 11:18
  • @Holger and this is better than for example using `sun.misc.Unsafe.putOrderedObject()`. – Dimitar Dimitrov Apr 07 '16 at 11:19
7

Yes, you need to synchronize accesses to the elements of a volatile array.

Other folks have already addressed how you can probably use CopyOnWriteArrayList or AtomicReferenceArray instead, so I'm going to veer off into a slightly different direction. I'd also recommend reading Volatile Arrays in Java by one of the big JMM contributors, Jeremy Manson.

Now, I can guarantee that only one thread (call it writer) can write to the array as e.g. follows:

Whether you can give single writer guarantees or not is not in any way related to the volatile keyword. I think you didn't have that in mind, but I'm just clarifying, so that other readers don't get the wrong impression (I think there's a data race pun that can be made with that sentence).

All other threads will only read values written by the writer thread.

Yes, but as your intuition correctly led you to suspect, this holds only for the value of the reference to the array. This means that unless you are writing array references to the volatile variable, you won't get the write part of the volatile write-read contract.

What this means is that either you want to do something like

objects[i] = newObj;
objects = objects;

which is ugly and awful in many different ways. Or you want to publish a brand new array each time your writer makes an update, e.g.

Object[] newObjects = new Object[100];

// populate values in newObjects, make sure that newObjects IS NOT published yet

// publish newObjects through the volatile variable
objects = newObjects;

which is not a very common use-case.
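
The "publish a brand new array" variant can be sketched as a small holder class. This is a minimal illustration assuming a single writer thread, as in the question; the class name `SnapshotHolder` and its methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical holder; class and method names are illustrative only.
class SnapshotHolder {
    private volatile Object[] objects = new Object[100];

    // Must only be called from the single writer thread.
    void set(int index, Object value) {
        // Work on a private copy that no reader can see yet.
        Object[] copy = Arrays.copyOf(objects, objects.length);
        copy[index] = value;
        // Single volatile write: publishes the copy and all its contents.
        objects = copy;
    }

    // Safe from any thread: volatile read of the reference, then a plain
    // read from an array that is never modified after publication.
    Object get(int index) {
        return objects[index];
    }
}
```

Every update copies the whole array, so this only pays off when writes are rare compared to reads, which is the same trade-off CopyOnWriteArrayList makes.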

Notice that unlike setting array elements, which does not provide volatile-write semantics, getting array elements (with newObj = objects[i];) does involve a volatile read, because dereferencing the array first reads the volatile objects field; the element read that follows is an ordinary read :)

Because it would not be useful from performance standpoint if JVM provides some kind of memory consistency guarantees when writing to an array. But I'm not sure about that.

As you allude, ensuring the memory fencing required for volatile semantics on every element write would be very costly, and if you add false sharing to the mix, it becomes even worse.

Didn't find anything helpful in documentation.

You can safely assume then that the volatile semantics for array references are exactly the same as the volatile semantics for non-array references, which is not surprising at all, considering how arrays (even primitive ones) are still objects.

Dimitar Dimitrov
  • A dummy write like `objects = objects;` is *not* sufficient. A reading thread can read the array reference right between the `objects[i] = newObj;` and the `objects = objects;` statement and at this point, there is no happens-before relationship. It seems to be a common mistake to think a thread performing a `volatile` read will wait until the other thread has performed its `volatile` write. Publishing via a `volatile` variable only works if the application can deal with the possibility of the reader reading the old value, like in the “publish a brand new array” solution. – Holger Apr 05 '16 at 16:57
  • _getting array elements (with newObj = objects[i];) does provide volatile-read semantics_ Why that? We can only say that `objects` references to the actual value, not a stale one. But that's not true about `objects[i]` reference. – St.Antario Apr 05 '16 at 18:00
  • @Holger how is such a dummy write *not* sufficient? I'll go out on a limb and say that in most concurrent programs reads are not done just as a reaction to writes. So even if you use `AtomicReferenceArray`, you can have your writer thread suspended just before it updates a specific element and your reader thread reading what in some context is a stale value. The only difference is that the above example allows an actual data race (whether it can be classified as benign is another question). I think you are right about the common mistake, although I don't see how it's related to my answer. – Dimitar Dimitrov Apr 05 '16 at 18:25
  • @St.Antario It does provide `volatile`-read semantics for the `objects` array. Somewhat simplified, this read *synchronizes-with* the last `volatile` write it sees, and that `volatile` write *happens-before* the `volatile` read. If you have been using the "publish a brand new array" approach, you are guaranteed to see for `objects[i]` a value no older than the one written just before the seen `volatile` write. – Dimitar Dimitrov Apr 05 '16 at 19:01
  • @Dimitar Dimitrov: if you have another synchronizing action that guarantees that the writer is done, that dummy write is obsolete. Without that additional action, the dummy write has no effect. It only establishes a happens-before relationship for *subsequent* reads but without having an observable effect, you can never tell whether a read is subsequent or not. And mind the difference between potentially reading an old (but correct) value and reading an *inconsistent* (e.g. partially written, out-of-order written) value. Only publishing new arrays guarantees consistent values. – Holger Apr 06 '16 at 08:14
  • @Holger can you please clarify why the dummy write has no effect by itself? If there's a *happens-before* edge between a subsequent read and that write, the read sees a value for `objects[i]` no older than `newObj` (the value written before the dummy write). Like I've mentioned in my previous comment, it is a racy read, but your initial comment did not mention anything about that, so I figured you have a different point. If you can clarify that point, we can discuss it further :) I'm also not sure what do you mean by out-of-order written; I assume partially written means word tearing. – Dimitar Dimitrov Apr 06 '16 at 08:25
  • *If* there's a happens-before edge, but if a reader reads the array reference *before* the `objects = objects;` write happens (it can do this because the reference doesn’t change), there is *no* happens-before edge. Without a happens-before edge, it might still see the new array elements, but may see (some) of their fields in uninitialized or inconsistent state. If you don’t understand the implications of improperly published objects or what “out of order writes” means, you shouldn’t answer multi-threading questions. – Holger Apr 06 '16 at 08:40
  • Maybe reading [this](https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4) helps… – Holger Apr 06 '16 at 08:41
  • @Holger thanks for the tips, and the somewhat rude comment :) I have read the JMM, although I think http://shipilev.net/blog/2014/safe-public-construction/ or the related discussions on the concurrency-interest mailing list are a better reference. You do realize that safe publication of an object, including its internals, is responsibility of the object itself, right? If the array elements do not take care about that themselves, synchronization mechanisms applied on their collection won't do much. So your comment is irrelevant to the admittedly awful dummy write and is just bone picking. – Dimitar Dimitrov Apr 06 '16 at 08:52
  • “safe publication of an object, including its internals, is responsibility of the object itself, right?”—it seems you are mixing something up here, because the article you have linked discusses *Singletons* only, in which case the *class* of the object is responsible for the publication of the sole instance. I don’t know about your projects, but in the projects I know, only a minority of objects are Singletons. For all others, the creator of an object is responsible for the correct publication. Unless the object itself is immutable, it can’t do anything regarding that. – Holger Apr 06 '16 at 09:10
  • @Holger Sorry didn't see that comment, I promise it's my last reply :) The article uses Singletons only for illustrating the concepts. The only `static`-specific thing is the static holder idiom. The article also distinguishes between safe publication and safe initialization, and it shows how you can construct an object (not necessarily immutable) so that its contents are always safely initialized. And this way you are not concerned if in your reader thread, there's no *happens-before* between the `volatile` write and your `volatile` read. P.S. Let's continue this on another medium :) – Dimitar Dimitrov Apr 07 '16 at 11:30