
I just ran across an article that makes a claim I have never heard before and cannot find anywhere else. The claim is that from the perspective of another thread, the assignment of the value returned by a constructor may be reordered with respect to instructions inside the constructor. In other words, the claim is that in the code below, another thread could read a non-null value of a in which the value of x has not been set.

class MyInt {
   private int x;

   public MyInt(int value) {
      x = value;
   }

   public int getValue() {
      return x;
   }
}

MyInt a = new MyInt(42);

Is this true?

Edit:

I think it's guaranteed that from the perspective of the thread executing MyInt a = new MyInt(42), the assignment of x has a happens-before relationship with the assignment of a. But both of these values may be cached in registers, and they may not be flushed to main memory in the same order they were originally written. Without a memory barrier, another thread could therefore read the value of a before the value of x has been written. Correct?

So based on axtavt's answer and the comments that follow, are these assessments of thread safety correct?

// thread-safe
class Foo {
   final int[] x;

   public Foo() {
      int[] tmp = new int[1];
      tmp[0] = 42;
      x = tmp; // memory barrier here
   }
}

// not thread-safe
class Bar {
   final int[] x = new int[1]; // memory barrier here

   public Bar() {
      x[0] = 42; // assignment may not be seen by other threads
   }
}

If that's correct... wow, that's really subtle.

Kevin Krumwiede

3 Answers


The article you cited is conceptually correct. It's somewhat imprecise in its terminology and usage, as is your question, and this leads to potential miscommunication and misunderstandings. It may seem like I'm harping on terminology here, but the Java Memory Model is very subtle, and if the terminology isn't precise, then one's understanding will suffer.

I'll excerpt points from your question (and from comments) and provide responses to them.

The assignment of the value returned by a constructor may be reordered with respect to instructions inside the constructor.

Almost yes... it isn't instructions but memory operations (reads and writes) that may be reordered. A thread could execute two write instructions in a particular order, but the arrival of the data in memory, and thus the visibility of those writes to other threads, may occur in a different order.

I think it's guaranteed that from the perspective of the thread executing MyInt a = new MyInt(42), the assignment of x has a happens-before relationship with the assignment of a.

Again, almost. It is true that in program order the assignment to x occurs prior to the assignment to a. However, happens-before is a global property that applies to all threads, so it doesn't make sense to talk about happens-before with respect to a particular thread.

But both of these values may be cached in registers, and they may not be flushed to main memory in the same order they were originally written. Without a memory barrier, another thread could therefore read the value of a before the value of x has been written.

Yet again, almost. Values can be cached in registers, but parts of the memory hardware such as cache memory or write buffers can also result in reorderings. Hardware can use a variety of mechanisms to change ordering, such as cache flushing or memory barriers (which generally don't cause flushing, but merely prevent certain reorderings). The difficulty with thinking about this in terms of hardware, though, is that real systems are quite complex and have different behaviors. Most CPUs have several different flavors of memory barriers, for instance. If you want to reason about the JMM, you should think in terms of the model's elements: memory operations and synchronizations that constrain reorderings by establishing happens-before relationships.

So, to revisit this example in terms of the JMM, we see a write to the field x and a write to a field a in program order. There is nothing in this program that constrains reorderings, i.e. no synchronization, no operations on volatiles, no writes to final fields. There is no happens-before relationship between these writes, and therefore they can be reordered.

There are a couple ways to prevent these reorderings.

One way is to make x final. This works because the JMM says that writes to final fields before the constructor returns happen-before operations that occur after the constructor returns. Since a is written after the constructor returns, the initialization of the final field x happens-before the write to a, and no reordering is allowed.
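As a minimal sketch of the final-field approach, here is a variant of the question's MyInt with x declared final. The names FinalMyInt and RacyHolder are illustrative and not from the original post. The holder deliberately publishes the reference through a plain field, so a reader may still see null, but under the final-field rule described above it should not see a non-null reference whose x is still 0.

```java
// Variant of the question's MyInt with x declared final (class name is illustrative).
final class FinalMyInt {
    private final int x;

    FinalMyInt(int value) {
        x = value; // write to a final field inside the constructor
    }

    int getValue() {
        return x;
    }
}

// Hypothetical holder with a plain field: no volatile, no synchronization.
class RacyHolder {
    static FinalMyInt a;

    static void publish() {
        a = new FinalMyInt(42);
    }

    static int read() {
        FinalMyInt local = a; // read the shared reference exactly once
        // Either the reference is not visible yet (-1) or x is fully initialized (42).
        return (local != null) ? local.getValue() : -1;
    }
}
```

Note that this only covers the final field itself (and whatever is reached through it); it also assumes the constructor does not leak this.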

Another way is to use synchronization. Suppose the MyInt instance were used in another class like this:

class OtherObj {
    MyInt a;
    synchronized void set() {
        a = new MyInt(42);
    }
    synchronized int get() {
        return (a != null) ? a.getValue() : -1;
    }
}

The unlock at the end of the set() call occurs after the writes to the x and the a fields. If another thread calls get(), it takes a lock at the beginning of the call. This establishes a happens-before relationship between the lock's release at the end of set() and the lock's acquisition at the beginning of get(). This means that the writes to x and a cannot be reordered after the beginning of the get() call. Thus the reader thread will see valid values for both a and x and can never find a non-null a and an uninitialized x.

Of course if the reader thread calls get() earlier, it may see a as being null, but there is no memory model issue here.
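To make the two possible outcomes concrete, here is a small usage sketch of the OtherObj pattern with one writer and one reader thread. The OtherObjDemo class and its runOnce method are hypothetical additions for illustration; MyInt and OtherObj are repeated so the block is self-contained. Depending on which thread acquires the lock first, the reader should observe either -1 (a still null) or 42, and not 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MyInt {
    private int x; // deliberately not final

    MyInt(int value) {
        x = value;
    }

    int getValue() {
        return x;
    }
}

class OtherObj {
    MyInt a;

    synchronized void set() {
        a = new MyInt(42);
    }

    synchronized int get() {
        return (a != null) ? a.getValue() : -1;
    }
}

// Hypothetical driver: runs one writer and one reader against the same OtherObj.
class OtherObjDemo {
    static int runOnce() {
        OtherObj obj = new OtherObj();
        AtomicInteger seen = new AtomicInteger();
        Thread writer = new Thread(obj::set);
        Thread reader = new Thread(() -> seen.set(obj.get()));
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get(); // value the reader observed
    }
}
```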

Your Foo and Bar examples are interesting and your assessment is essentially correct. Writes to array elements that occur before assignment to a final array field cannot be reordered after. Writes to array elements that occur after the assignment to the final array field may be reordered with respect to other memory operations that occur later, so other threads may indeed see out-of-date values.

In the comments you had asked about whether this is an issue with String since it has a final field array containing its characters. Yes, it is an issue, but if you look at the String.java constructors, they are all very careful to make the assignment to the final field at the very end of the constructor. This ensures proper visibility of the contents of the array.

And yes, this is subtle. :-) But the problems only really occur if you try to be clever, like trying to avoid using synchronization or volatile variables. Most of the time doing this isn't worth it. If you adhere to "safe publication" practices, including not leaking this during the constructor call, and storing references to constructed objects using synchronization (such as my OtherObj example above), things will work exactly as you expect them to.
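Since volatile variables are mentioned as one of the safe options, here is a minimal sketch of publishing through a volatile field instead of synchronized methods. PlainMyInt and VolatileHolder are hypothetical names. The write to x precedes the volatile write to a in program order, and a reader that sees the non-null reference through the volatile read is then expected to see x as 42 as well.

```java
// Same shape as the question's MyInt: x is not final (class name is illustrative).
class PlainMyInt {
    private int x;

    PlainMyInt(int value) {
        x = value;
    }

    int getValue() {
        return x;
    }
}

// Hypothetical holder that publishes the object through a volatile field.
class VolatileHolder {
    private volatile PlainMyInt a;

    void set() {
        a = new PlainMyInt(42); // volatile write publishes the constructed object
    }

    int get() {
        PlainMyInt local = a; // single volatile read
        return (local != null) ? local.getValue() : -1;
    }
}
```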


Stuart Marks

In the sense of the Java Memory Model, yes. That doesn't mean you will observe it in practice, though.

Look at it from the following angle: optimizations that may result in visible reordering may happen not only in the compiler, but also in the CPU. But the CPU doesn't know anything about objects and their constructors; to the processor it's just a pair of assignments that can be reordered if the CPU's memory model allows it.

Of course, the compiler and JVM may instruct the CPU not to reorder these assignments by placing memory barriers in the generated code, but doing so for all objects would ruin performance on CPUs that rely heavily on such aggressive optimizations. That's why the Java Memory Model doesn't provide any special guarantees for this case.

This leads, for example, to the well-known flaw in the double-checked locking singleton implementation under the Java Memory Model.
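For reference, a sketch of the double-checked locking idiom in the form that is generally considered correct on Java 5 and later, i.e. with the instance field declared volatile. The class name LazySingleton and its value field are illustrative. Without volatile, the unsynchronized first read could observe a non-null reference to an object whose fields are not yet visible, which is exactly the reordering discussed in the question.

```java
// Illustrative singleton using double-checked locking with a volatile field.
final class LazySingleton {
    private static volatile LazySingleton instance;

    private int value; // deliberately non-final, like x in the question

    private LazySingleton() {
        value = 42;
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance; // first check, no lock taken
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance; // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local; // volatile write publishes the object
                }
            }
        }
        return local;
    }

    int getValue() {
        return value;
    }
}
```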

axtavt
  • Are there any cases where it is observed in practice? In other words, do I need to put a memory barrier if I want the guarantee that a call to `a.getValue()` immediately following the quoted code will return `42`? – Miserable Variable Jul 16 '14 at 20:01
  • @MiserableVariable: There are claims that some very old JVM implementations actually did such a reordering in the compiler. Regarding CPU optimizations, as far as I know the x86 memory model doesn't allow any reordering in this case, therefore double-checked locking on x86 is safe. Though I think that it's not a good idea to violate the memory model of the language anyway. – axtavt Jul 16 '14 at 20:04
  • My understanding of the infamous double-checked locking is that it was broken prior to 1.5, at which point it actually started working thanks to the new guarantees provided by the `volatile` keyword. But it's still considered bad practice for being too clever and not really gaining much, if anything. – Kevin Krumwiede Jul 16 '14 at 20:09
  • @KevinKrumwiede: That's a different question. Prior to 1.5, double-checked locking was incorrect even with `volatile`. From 1.5 onwards, double-checked locking with `volatile` is correct, but it's still unsafe without it (in the sense of the JMM, not in practice). – axtavt Jul 16 '14 at 20:12
  • So if the answer to my original question is "yes", wouldn't this have serious implications for classes like `String`, since the elements of an array cannot be made final? The article suggests that making `x` final would eliminate the issue. Would making `a` final have the same effect? – Kevin Krumwiede Jul 16 '14 at 20:14
  • In other words, would making `x` final make any difference if `x` were an array? And if not, what can be done about it? – Kevin Krumwiede Jul 16 '14 at 20:22
  • `final` guarantees that results of the actions that happened before initialization of a `final` field will be visible to other threads if they access these results by dereferencing the `final` field - therefore yes. – axtavt Jul 16 '14 at 20:27

In other words, the claim is that in the code below, another thread could read a non-null value of a in which the value of x has not been set.

Short answer is yes.

Long answer: what underpins another thread reading a non-null a whose x has not been set is not strictly instruction reordering, but the processor caching values in its registers (and L1 caches) rather than reading them from main memory. This may indirectly imply reordering, but reordering is not necessary.

While caching values in CPU registers helps speed up processing, it introduces the problem of visibility of values between different threads running on different CPUs. If the values were always read from main memory, all threads would consistently see the same value (because there is one copy of that value). In your example code, if the value of the member field x is cached in a register of CPU-1, which is accessed by Thread-1, and another thread, Thread-2, running on CPU-2 now reads this value from main memory and updates it, the value as cached in CPU-1 (and processed by Thread-1) is now invalid from the program's point of view, but the Java specification itself allows virtual machines to treat this as a valid scenario.

Bhaskar
  • If I understand your answer correctly, I think you're answering the wrong question. Note that there is no setter in `MyInt`. Once set by the constructor, the value of `x` will never be updated by any thread. It might as well be final, but the article claims that the fact that it isn't final creates a potential problem. I still don't understand how Thread-2 in your explanation could read a non-null value of `a` **before** the value of `x` in the object referenced by `a` is set to 42. – Kevin Krumwiede Jul 16 '14 at 20:42
  • Your confusion is valid - you are not updating the value of `x` after it is set in its constructor. But imagine, referring to the threads in my answer, that the constructor is called on Thread-2 (which updates the value from its default, 0 for the int type, to `value`) and Thread-1 calls the getter on the reference `a`. Do you see the problem? – Bhaskar Jul 16 '14 at 20:47
  • I may be even more confused. Did you just reverse the threads in your comment vs. your answer? – Kevin Krumwiede Jul 16 '14 at 20:51
  • If Thread-1 updates the value of `x` from its default value of 0, *then* writes the object reference to `a`, I don't understand how Thread-2 can obtain a non-null `a` that refers to an object in which `x` is not set. Unless... the CPU reorders writing `x` to main memory with respect to writing `a` to main memory. After `a` is written to main memory but before `x` is written to main memory, Thread-2 reads and caches a copy of (the object referenced by) `a`. Is that it? – Kevin Krumwiede Jul 16 '14 at 20:55
  • No. Not reversed. Thread-1 is a reading thread, Thread-2 is an update thread. This is the same in both places. It is not important which particular thread does what, except that there are two threads: one reads and one updates. The thread that updates does the update on its own copy, and the thread that reads reads its own copy. Because there is no guarantee of when the exact update and the exact read will happen (with respect to each other), there is always a chance that these are not consistent. HTH. – Bhaskar Jul 16 '14 at 20:57
  • "If Thread-1 updates the value of x from its default value of 0, **then** writes the object reference to a" - You are assuming the existence of a happens-before relation here (look at your **then**). This is not guaranteed unless you use primitives such as `final` on the field `x`. – Bhaskar Jul 16 '14 at 21:03
  • You can always assume a happens-before relationship *within the same thread*. – Kevin Krumwiede Jul 16 '14 at 23:46
  • Within **a** thread, yes - but these two actions don't necessarily appear in that precise order to a thread that accesses the reference `a`. For this thread, it is quite legitimate that when it sees a non-null ref `a`, the action of updating `x` to `value` has not happened yet. In a way, this implies instruction reordering in the way you mentioned in your earlier comment. – Bhaskar Jul 17 '14 at 06:26