15

In CopyOnWriteArrayList.java, in the method set(int index, E element) below:

public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        Object oldValue = elements[index];

        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);----? Why this call required?
        }
        return (E)oldValue;
    } finally {
        lock.unlock();
    }
}

Why the call to setArray is required? I couldn't understand the comment written above that method call. Is it because we are not using synchronised block, we have to flush manually all the variable we use? In the above method they are using re-entrant locks. If they had used synchronised statement do they still need to call setArray method?. I think no.

Question2: If we end up in else, it means we didn't modified elements array, then why we need to flush the value of variable array?

Lii
  • 11,553
  • 8
  • 64
  • 88
Pushparaj
  • 1,039
  • 1
  • 6
  • 26

5 Answers5

10

This code uses deep Java Memory Model voodoo, as it mixes both locks and volatiles.

The lock usage in this code is easy to dispense with, though. Locking provides memory ordering among threads that use the same lock. Specifically, the unlock at the end of this method provides happens-before semantics with other threads that acquire the same lock. Other code paths through this class, though, don't use this lock at all. Therefore, the memory model implications for the lock are irrelevant to those code paths.

Those other code paths do use volatile reads and writes, specifically to the array field. The getArray method does a volatile read of this field, and the setArray method method does a volatile write of this field.

The reason this code calls setArray even when it's apparently unnecessary is so that it establishes an invariant for this method that it always performs a volatile write to this array. This establishes happens-before semantics with other threads that perform volatile reads from this array. This is important because the volatile write-read semantics apply to reads and writes other than those of the volatile field itself. Specifically, writes to other (non-volatile) fields before a volatile write happen-before reads from those other fields after a volatile read of the same volatile variable. See the JMM FAQ for an explanation.

Here's an example:

// initial conditions
int nonVolatileField = 0;
CopyOnWriteArrayList<String> list = /* a single String */

// Thread 1
nonVolatileField = 1;                 // (1)
list.set(0, "x");                     // (2)

// Thread 2
String s = list.get(0);               // (3)
if (s == "x") {
    int localVar = nonVolatileField;  // (4)
}

Let's assume that line (3) gets the value set by line (2), the interned string "x". (For the sake of this example we use identity semantics of interned strings.) Assuming this is true, then the memory model guarantees that the value read at line (4) will be 1 as set by line (1). This is because the volatile write at (2), and every earlier write, happen-before the volatile read at line (3), and every subsequent read.

Now, suppose that the initial condition were that the list already contained a single element, the interned string "x". And further suppose that the set() method's else clause didn't make the setArray call. Now, depending on the initial contents of the list, the list.set() call at line (2) might or might not perform a volatile write, therefore the read at line (4) might or might not have any visibility guarantees!

Clearly you don't want these memory visibility guarantees to depend upon the current contents of the list. To establish the guarantee in all cases, set() needs to do a volatile write in all cases, and that's why it calls setArray() even if it didn't do any writing itself.

EDIT 2022-07-13

Holger raised an interesting issue in the comments:

If by the time, thread 1 does list.set(0, "x"); , the first element is already "x", the scenario we’re talking about, then the thread 2 can not assume that list.get(0) == "x" proved that thread 1 did perform list.set(0, "x");, as the condition is always fulfilled whether thread 2’s read was subsequent to thread 1’s write or not. So if the element doesn’t change, there is no happens-before relationship between (1) and (4) here. The redundant setArray call isn’t enforcing a memory visibility either, as the reader thread could have read the array reference right before that write.

It is true that, looking only at this code in isolation, there is no guarantee that the set at (2) is performed before the get at (3). However, these are operations on a volatile variable, and as such, they are synchronization actions. Under the JMM, synchronization actions have a total order. That is, they will occur in some order, but we don't know which one. The operations could occur in the order (2)->(3) or (3)->(2); there are no other possibilities.

If the order is (3)->(2) then Holger is correct, there is no happens-before relationship, and a subsequent read such as (4) could get a stale value.

However, if the order is (2)->(3) then there is a happens-before relationship, and the read at (4) is guaranteed to see the write at (1).

Isn't this pointless, though, since we can't guarantee the order in which the synchronization actions are performed? And to establish that order, usually we would use some synchronization operations between the threads, which would provide the necessary memory visibility guarantees. Doesn't that make the unconditional volatile write at (2) useless?

Not necessarily. There are mechanisms external to the system, such as timers, network messages, or user interaction, that can clearly establish ordering between certain operations, but that don't establish memory visibility. For example, suppose Thread 1 performs its operation frequently (say, once per second) and Thread 2 performs its operation (say, once per minute). Our application might want Thread 2 to get some recent value, though not necessarily the absolute most recent value. The volatile write performed repeatedly by Thread 1 (and the corresponding volatile read by Thread 2) ensure that Thread 2 sees the 59th or 60th update from Thread 1. If Thread 1 weren't performing any volatile writes, Thread 2 might see an arbitrarily old value.

This is an extremely narrow edge case, but I think it establishes the need for CopyOnWriteArrayList::set to perform its volatile write unconditionally.

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • Are you saying that unlock() doesn't establish "happens-before semantics with other threads"? – Peter Lawrey Feb 28 '15 at 12:05
  • `unlock()` always sets a `volatile` making this redundant. (See my answer) – Peter Lawrey Feb 28 '15 at 12:29
  • 3
    @PeterLawrey For the unlock to establish proper memory semantics with another thread, that other thread must take *the same lock*. Similarly with volatile: to establish ordering, thread 1 must do a volatile write and thread 2 must do a volatile read *of the same volatile variable* not just any volatile variable. – Stuart Marks Feb 28 '15 at 16:53
  • @StuartMarks Thanks for backing me up. I thought that was clear from the JMM spec. – Erwin Bolwidt Mar 01 '15 at 06:42
  • @StuartMarks The whole point of happens-before, happens-after relationships is to establish what happens before/after all memory access either side of a memory fence which is not specific to a memory location. http://openjdk.java.net/jeps/171 implemented in Java 8, shows that memory fences are not specific to a memory location. – Peter Lawrey Mar 01 '15 at 10:38
  • 2
    @PeterLawrey The JMM (specified in JLS 17.4, http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4) describes "synchronization actions," which can establish "synchronizes-with" relationships, which establish happens-before semantics. Crucially, only volatile writes and reads to the same variable are specified to provide a synchronizes-with relationships (JLS 17.4.4, 2nd bullet). The JLS does not discuss fences. Note also that the JMM applies to memory reorderings that can occur not only because of processors and caches but also JIT compilers. – Stuart Marks Mar 01 '15 at 22:40
  • 2
    @PeterLawrey Fences can be used to **implement** memory ordering semantics. Indeed, JEP 171 is about implementation. Fences, as you note, typically apply ordering to all memory operations, not just operations on the same volatile variable as spec'd by the JMM. Thus fences' semantics differ from and are stronger than the JMM's semantics. If the `setArray` were removed from the `else` clause, it would probably work in many (most? all?) **current** JVM implementations, but a future JIT optimization legal under the JMM might break it. Thus the call is necessary to hew to the JMM's semantics. – Stuart Marks Mar 01 '15 at 22:46
  • It appears I have confused what the JLS specifies and how volatile happens to be implemented on x64 as these processors don't have an independent happen - relationship for every memory location AFAIK. It only support whole memory semantics. – Peter Lawrey Mar 02 '15 at 05:07
  • I understand how we combine a volatile variable and a lock, but I don't understand the else condition. If we haven't changed anything, what should be made visible then? – Konstantin Milyutin May 04 '16 at 14:46
  • 1
    @damluar In the else-case, it's true that the `set` method itself hasn't changed anything. However, the caller might have written to a non-volatile field prior to calling `set`. The `set` method always wants to do a volatile write, even if the method itself hasn't done any writes of its own, to ensure that any earlier non-volatile writes are made visible to other threads. That's what my example illustrates. – Stuart Marks May 12 '16 at 00:51
  • @StuartMarks, is it really a good idea to rely on such side-effects of a class? For me it seems like we are relying on internal implementation of some other class that might change later. – Konstantin Milyutin Nov 03 '16 at 22:16
  • 1
    @damluar The [CopyOnWriteArrayList specification](http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CopyOnWriteArrayList.html) says "actions in a thread prior to placing an object into a CopyOnWriteArrayList happen-before actions subsequent to the access or removal of that element from the CopyOnWriteArrayList in another thread." The apparently unnecessary call to `setArray`, providing volatile write semantics, is required by the specification. – Stuart Marks Nov 03 '16 at 23:41
  • 1
    @damluar Now as to why the specification requires this, establishing a *happens-before* relationship is fundamental in order to communicate information reliably between threads. If the spec didn't require this, it would give rise to data races, meaning that the reading thread could never be sure that it wasn't seeing out-of-date values. – Stuart Marks Nov 03 '16 at 23:44
  • 2
    If by the time, thread 1 does `list.set(0, "x");` , the first element is already `"x"`, the scenario we’re talking about, then the thread 2 can *not* assume that `list.get(0) == "x"` proved that thread 1 did perform `list.set(0, "x");`, as the condition is *always* fulfilled whether thread 2’s read was subsequent to thread 1’s write or not. So if the element doesn’t change, there is no *happens-before* relationship between (1) and (4) here. The redundant `setArray` call isn’t enforcing a memory visibility either, as the reader thread could have read the array reference right before that write. – Holger Jun 10 '22 at 14:51
  • 1
    @Holger Good question! I've edited my answer. Sorry for the delay, vacation, etc. – Stuart Marks Jul 14 '22 at 01:45
  • 1
    I don’t mind if the response to a comment posted six years later takes a few weeks :-) Regarding ordering of external events, I still doubt that they are relevant to the *synchronization order*, when the JMM doesn’t mention them. Most notably, elapsed time does not count at all, as the specification says explicitly that `Thread.sleep(…)` does not have any synchronization semantics. Making external clocks agree on the actual time is another topic. That’s why we have the term “synchronization order” in the first place, because it doesn’t have to match the temporal order. – Holger Jul 14 '22 at 09:11
3

TLDR; The call to setArray is required to provide the guarantee specified in the Javadoc of CopyOnWriteArrayList (even when the contents of the list is not changed)


CopyOnWriteArrayList has a memory-consistency guarantee specified in the Javadoc:

Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a CopyOnWriteArrayList happen-before actions subsequent to the access or removal of that element from the CopyOnWriteArrayList in another thread.

The call to setArray is necessary to enforce this guarantee.

As the Java Memory Model specification in the JLS states:

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

So the write to array (using the setArray) method is necessary to ensure that other threads reading from the list now have a happens-before (or rather, happens-after) relationship with the thread that called the set method, even when the element in the set method was already identical (using ==) with the element that was already in the list at that position.

Updated explanation

Going back to the guarantee in the Javadoc. There is this order of things (assuming an access, not a removal, as the last action - a removal is already taken care of because of the use of lock, but an access doesn't use lock) :

  1. An action in thread A prior to placing an object into the CopyOnWriteArrayList
  2. Placing and object into the CopyOnWriteArrayList (presumably on thread A, although the Javadoc could be clearer about this)
  3. Accessing [reading] an element from the CopyOnWriteArrayList on thread B

Assuming that step 2 places an element into the list that was already there, we see that the code goes into this branch:

} else {
    // Not quite a no-op; ensures volatile write semantics
    setArray(elements);
}

This call to setArray ensures a volatile write on field array from thread A. Since thread B will do a volatile read on field array, a happens-before relationship is created between thread A and thread B, which wouldn't have been created if the else-branch wasn't there.

Erwin Bolwidt
  • 30,799
  • 15
  • 56
  • 79
3

In JDK 11 this useless operation is allready removd from the source code. see code below.

//code from JDK 11.0.1
public E set(int index, E element) {
    synchronized (lock) {
        Object[] es = getArray();
        E oldValue = elementAt(es, index);

        if (oldValue != element) {
            es = es.clone();
            es[index] = element;
            setArray(es);
        }
        return oldValue;
    }
}

Update: As @Stuart Marks commented, in new JDK version, the operation of setArray(es) was backported. The key point is CopyOnWriteArrayList it's self has no required call the setArray(es) when the elements is exist , but when some other variable relay on the happens before rule of set(int index, E element) the setArray(es) will required. The JDK-8221120 description has a mail list link discuss the issue with a very good example.

Allen
  • 31
  • 4
  • 2
    Note that the operation was restored in the JDK mainline and that this was backported to JDK 11. See [source code](http://hg.openjdk.java.net/jdk-updates/jdk11u/file/113c646a33d2/src/java.base/share/classes/java/util/concurrent/CopyOnWriteArrayList.java#l401) and bug [JDK-8221120](https://bugs.openjdk.java.net/browse/JDK-8221120). – Stuart Marks Jun 10 '21 at 16:19
1

I believe it is because other methods that read the array do not obtain the lock, so there is no guarantee of happens before ordering. The way to preserve such ordering is to update the volatile field which does guarantee such ordering. (This is the write semantics that it is referring to)

Darren
  • 722
  • 4
  • 12
0

It is not required AFAICS. This is for two reasons.

  • write semantics are only need if you perform a write, this this doesn't.
  • the lock.unlock() performs write semantics in a finally block, unavoidably.

The method

lock.unlock()

always calls through to

private volatile int state;

protected final void setState(int newState) {
    state = newState;
}

and this gives the has happens before semantics as setArray() making the set array redundant. You might claim that you don't want to depend on the implementation of ReentrantLock, but if you are worried that a future version of ReentrantLock is not thread safe, you might have bigger problems if this is the case.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • I'm still puzzled by this problem and I wonder if you could elaborate on "write semantics." I think there must be some good reason for having the `setArray` call there, even if it has no effect but, perhaps, satisfies some strict interpretation of the Java spec. – Lucas Ross Feb 27 '15 at 20:44
  • It's not required for internal consistency of CopyOnWriteArrayList, but it is required to meet the memory consistency guarantees given by the Javadoc to the callers of the methods (see my answer) – Erwin Bolwidt Feb 28 '15 at 02:44
  • @ErwinBolwidt There is no way to say memory "A" has a happens before guarantee but memory "B" doesn't. When you perform a memory barrier (as unlock does) this applies to all memory. – Peter Lawrey Feb 28 '15 at 12:04
  • @PeterLawrey agree. I didn't state that it does so I'm a bit confused why you are saying this. – Erwin Bolwidt Feb 28 '15 at 13:24
  • @PeterLawrey Maybe the confusion is about the volatile write - volatile writes and reads only set up a happens-before relationship when they are to the *same* field (see the Java Memory Model specification) - not just to any volatile field – Erwin Bolwidt Feb 28 '15 at 13:29
  • You have to code it according to the spec. You cannot make assumptions about memory barriers / architecture etc. – Darren Feb 28 '15 at 14:51