9

What does it mean for a reference to be assigned atomically in Java?

  • I understand what it means for a long or a double: a thread can see a partially written number,
  • but for an object I don't understand it, since assignment does not copy anything; it just points the variable at an address in memory.

So what could go wrong if reference assignment were not atomic in Java?

Chetan Kinger
user1409534

3 Answers

13

This means that you will never observe a corrupted reference. Suppose that you have the following class:

class MyClass {
    Object obj = null;
}

In memory, obj is initially a null pointer, typically represented as an integer such as 0x00000000. Now suppose that one thread performs an assignment:

this.obj = new Object();

Suppose that new Object() is allocated in memory at address 0x12345678. Reference atomicity ensures that when another thread reads obj, it gets either the null pointer (0x00000000) or the pointer to the new object (0x12345678). Under no circumstances can it get a partially assigned reference (like 0x12340000) that points nowhere.

This might look obvious, but the problem can appear in low-level languages like C, depending on the CPU architecture and memory alignment. For example, if a pointer is misaligned and crosses a cache line boundary, it may not be updated atomically. To avoid this situation, the Java virtual machine always aligns pointers so that they never cross a cache line.

So if Java references were non-atomic, dereferencing a reference written by another thread could yield neither the object referenced before the assignment nor the one referenced after it, but a random memory location (which could lead to a segmentation fault, a corrupted heap, or some other disaster).
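The guarantee described above can be sketched with a small experiment. This is only an illustration, not a proof: the class and method names are made up for this example, and the shared field is deliberately left non-volatile. A writer thread keeps flipping the field between two known objects while a reader checks that every value it reads is one of those two.

```java
public class ReferenceAtomicityDemo {
    // Two known objects; the shared field only ever holds one of them.
    static final Object A = new Object();
    static final Object B = new Object();

    // Deliberately not volatile: reference writes are atomic regardless,
    // only the timing of visibility is left unspecified.
    static Object shared = A;

    /** Returns true if the reader never observed anything other than A or B. */
    public static boolean readerSeesOnlyWholeReferences(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                shared = (i % 2 == 0) ? B : A;
            }
        });
        writer.start();

        boolean ok = true;
        for (int i = 0; i < iterations; i++) {
            Object seen = shared;
            if (seen != A && seen != B) {
                ok = false; // would indicate a torn (half-written) reference
            }
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(readerSeesOnlyWholeReferences(1_000_000));
    }
}
```

Because the Java Language Specification states that reads and writes of references are always atomic, the check inside the loop can never fail. The reader may see a stale value, but never a mixture of the two.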

Tagir Valeev
  • *partially constructed reference*. I find this line to be a bit incorrect. – Chetan Kinger Jun 25 '15 at 07:02
  • @ChetanKinger: is it better now? – Tagir Valeev Jun 25 '15 at 07:05
  • 1
    What I was trying to point out is that the reason we need an atomic reference is to prevent threads from using a reference to a partially constructed object. Instead of *partially constructed reference*, your answer should read *reference to a partially constructed object*. (In my opinion) – Chetan Kinger Jun 25 '15 at 07:08
  • @ChetanKinger, the OP's question is not about the `AtomicReference` class, it's about the Java Memory Model. It's not about partially constructed objects, it's about the references themselves, which could be stored non-atomically. – Tagir Valeev Jun 25 '15 at 07:10
  • 1
    From what I understand, a pointer on 64bit systems is usually a long (8 bytes) and not an integer. Anyway, +1 for a clear explanation and for context I found the following answers helpful: http://stackoverflow.com/a/11964034/3080094 and http://stackoverflow.com/a/15201349/3080094 – vanOekel Jun 25 '15 at 08:34
  • @vanOekel, I used 32 bits just as illustration. Actually Hotspot JVM often uses 32bit references even on 64bit systems: if you set -Xmx lower than 4Gb, then all references will fit the 32bits and if you set -Xmx lower than 32Gb, there's a "compressed oops" feature which also allows you to fit the pointers in 32bits. Exact pointer length is not specified by JVM specification and is up to implementation (I wonder if it's useful to have 48bit pointers?..) – Tagir Valeev Jun 25 '15 at 08:38
  • 2
    If you could get a corrupted reference, that would be a gaping security hole. – user253751 Jun 25 '15 at 08:51
10

Let's consider the classic double checked locking example to understand why a reference needs to be atomic :

class Foo {
    // Deliberately NOT volatile: this is the broken variant discussed below.
    private static Helper result;

    public static Helper getHelper() {
        if (result == null) { // 1
            synchronized (Foo.class) { // 2
                if (result == null) { // 3
                    result = new Helper(); // 4
                }
            }
        }
        return result; // 5
    }

    // other functions and members...
}

Let's consider 2 threads that are going to call the getHelper method :

  1. Thread-1 executes line number 1 and finds result to be null.
  2. Thread-1 acquires a class level lock on line number 2.
  3. Thread-1 finds result to be null on line number 3.
  4. Thread-1 starts instantiating a new Helper.
  5. While Thread-1 is still instantiating a new Helper on line number 4, Thread-2 executes line number 1.

Steps 4 and 5 are where an inconsistency can arise. At step 4, the result variable may already hold the address of the Helper object before its constructor has finished, because without volatile the compiler or CPU is allowed to reorder the reference write ahead of the writes that initialize the object's fields. If step 5 executes before the Helper object is fully initialized, Thread-2 sees that result is not null and may return a reference to a partially constructed object.

A way to fix the issue is to mark result as volatile or to use an AtomicReference. That said, the scenario above is unlikely to occur in practice, and there are better ways to implement a singleton than double-checked locking.
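The volatile fix mentioned above could look roughly like the sketch below. The class name VolatileFoo and the body of Helper are assumptions made for illustration; only the volatile modifier and the check-lock-check shape come from the discussion. This relies on the Java 5+ memory model, under which a volatile write happens-before any read that observes it.

```java
public class VolatileFoo {
    /** Hypothetical stand-in for the Helper class in the example above. */
    public static class Helper {
        public final int value = 42;
    }

    // volatile: a thread that reads a non-null reference here is also
    // guaranteed to see the fully constructed Helper behind it.
    private static volatile Helper result;

    public static Helper getHelper() {
        Helper local = result;              // one volatile read on the fast path
        if (local == null) {
            synchronized (VolatileFoo.class) {
                local = result;             // re-check under the lock
                if (local == null) {
                    local = new Helper();
                    result = local;         // publish only after construction
                }
            }
        }
        return local;
    }
}
```

Copying the field into a local variable is optional. It just avoids a second volatile read once the instance exists.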

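Here's an example of lazy initialization using AtomicReference (strictly speaking this is a lock-free compare-and-set rather than double-checked locking, since no lock is taken):

// requires: import java.util.concurrent.atomic.AtomicReference;
private static final AtomicReference<AtomicReferenceSingleton> instance = new AtomicReference<>();

public static AtomicReferenceSingleton getDefault() {
    AtomicReferenceSingleton ars = instance.get();
    if (ars == null) {
        // Only the first thread succeeds; losers discard their new instance.
        instance.compareAndSet(null, new AtomicReferenceSingleton());
        ars = instance.get();
    }
    return ars;
}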
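The answer does not say which "better ways" it has in mind. One commonly used alternative is the initialization-on-demand holder idiom, sketched here under that assumption with a made-up class name. It relies on the JVM's class initialization rules, which are lazy and thread-safe, so no explicit locking, volatile, or AtomicReference is needed.

```java
public class HolderSingleton {
    private HolderSingleton() { }

    // The nested class is not initialized until getInstance() first touches it,
    // and class initialization is performed at most once by the JVM.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    public static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```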

If you are interested in knowing why Step 5 can result in memory inconsistencies, take a look at this answer (as suggested by pwes in the comments)

Chetan Kinger
  • @ChetanKinger - **A way to fix the issue is to mark result as volatile or use a AtomicReference** but volatile only has relevance to modifications of the variable itself, not the object it refers to right? – Nirmal Jun 25 '15 at 07:24
  • 1
    @user3320018 AtomicReference's code uses `volatile` internally for the referenced object. It has nothing to do with the object's contents, which must do the same where required, or use proper synchronization – pwes Jun 25 '15 at 07:28
  • Ok cool. Another question - so the reference (not concrete object on heap) assignment itself needs two operations, am I right? – Nirmal Jun 25 '15 at 07:33
  • @user3320018 As far as I understand, allocating a memory location and initializing it is definitely a multi-step process, and that's where the inconsistency can arise. – Chetan Kinger Jun 25 '15 at 08:01
5

I'm assuming you are asking about AtomicReference<V>.

The idea is that if two or more threads read or update the value of a variable of reference type, you might get unexpected results. For example, suppose each thread checks if some reference type variable is null, and if it's null, creates an instance of that type and updates that reference variable.

This may cause two instances to be created if both threads see that the variable is null at the same time. If your code relies on all threads working with the same instance referred to by that variable, you'll be in trouble.

Now, if you use AtomicReference<V>, you can solve this problem by using the compareAndSet(V expect, V update) method. So a thread will update the variable only if some other thread didn't beat it to it.

For example :

static AtomicReference<MyClass> ref = new AtomicReference<> ();

... 
// code of some thread
MyClass obj = ref.get();
if (obj == null) {
    obj = new MyClass();
    if (!ref.compareAndSet (null, obj)) // try to set the atomic reference to a new value
                                        // only if it's still null
        obj = ref.get(); // if some other thread managed to set it before the current thread,
                         // get the instance created by that other thread
}
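To see the effect of compareAndSet under contention, the snippet above can be wrapped in a small harness. This is a sketch: the class name, getOrCreate, distinctInstancesSeen, and the empty MyClass are all hypothetical names introduced here. Several threads are released at once and the harness counts how many distinct instances they end up with.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class CasLazyInitDemo {
    /** Hypothetical stand-in for MyClass; uses identity equals/hashCode. */
    public static class MyClass { }

    private static final AtomicReference<MyClass> ref = new AtomicReference<>();

    /** Same logic as the snippet above, wrapped in a method. */
    public static MyClass getOrCreate() {
        MyClass obj = ref.get();
        if (obj == null) {
            obj = new MyClass();
            if (!ref.compareAndSet(null, obj)) {
                obj = ref.get(); // lost the race: use the winner's instance
            }
        }
        return obj;
    }

    /** Races threadCount threads and returns how many distinct instances they got. */
    public static int distinctInstancesSeen(int threadCount) {
        Set<MyClass> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();          // line all threads up first
                    seen.add(getOrCreate());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstancesSeen(8)); // prints 1
    }
}
```

Several MyClass objects may be constructed during the race, but only the one that wins the compareAndSet is ever returned, so every thread ends up holding the same instance.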
Eran
  • 1
    This is not perfect, as you might create two `MyClass` objects, which may be undesirable. Plain synchronized inside would help. – pwes Jun 25 '15 at 07:41
  • Indeed, the classic singleton example is not a good way to show this. – Kayaman Jun 25 '15 at 07:43
  • @pwes Yes, a redundant MyClass instance may be created by this code, but it won't be assigned to the atomic reference, and all threads will use the same instance. I agree it's not a perfect example. – Eran Jun 25 '15 at 07:46