The other day Howard Lewis Ship posted a blog entry called "Things I Learned at Hacker Bed and Breakfast". One of the bullet points is:

A Java instance field that is assigned exactly once via lazy initialization does not have to be synchronized or volatile (as long as you can accept race conditions across threads to assign to the field); this is from Rich Hickey

On the face of it this seems at odds with the accepted wisdom about visibility of changes to memory across threads, and if this is covered in the Java Concurrency in Practice book or in the Java language spec then I have missed it. But this was something HLS got from Rich Hickey at an event where Brian Goetz was present, so it would seem there must be something to it. Could someone please explain the logic behind this statement?

Nathan Hughes
  • Don't fear volatile reads. Class initialization (i.e. modifiable code) is the only portable way to do it without volatile. The statement is incorrect on CPU architectures that allow writes to be reordered. On x86 and SPARC TSO a volatile read is free, so there is no point in playing the hacker. – bestsss Jun 17 '12 at 23:09

4 Answers

9

This statement sounds a little cryptic. However, I guess HLS is referring to the case where you lazily initialize an instance field and don't care if several threads perform the initialization more than once.
As an example, I can point to the hashCode() method of the String class:

private int hashCode;

public int hashCode() {
    int hash = hashCode;
    if (hash == 0) {
        if (count == 0) {
            return 0;
        }
        final int end = count + offset;
        final char[] chars = value;
        for (int i = offset; i < end; ++i) {
            hash = 31*hash + chars[i];
        }
        hashCode = hash;
    }
    return hash;
}

As you can see, access to the hashCode field (which holds the cached value of the computed String hash) is not synchronized, and the field isn't declared volatile. Any thread that calls hashCode() will still receive the same value, even though the hashCode field may be written more than once by different threads.

This technique has limited applicability. IMHO it's useful mostly for cases like the one in the example: a cached primitive or immutable object that is computed from other final/immutable fields, where computing it in the constructor would be overkill.
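To make the pattern explicit outside of String, here is a minimal sketch of the same racy single-check caching. The Message class and its checksum() method are hypothetical names made up for illustration. The sketch assumes the cached value is an int (so the write is atomic), is derived only from final state, and that 0 can serve as the "not computed yet" marker.

```java
// Hypothetical class for illustration; the names are not from the JDK.
final class Message {
    private final String text;   // immutable input, set once in the constructor
    private int checksum;        // plain field on purpose; 0 means "not computed yet"

    Message(String text) {
        this.text = text;
    }

    int checksum() {
        int c = checksum;                 // read the shared field exactly once
        if (c == 0) {
            for (int i = 0; i < text.length(); i++) {
                c = 31 * c + text.charAt(i);
            }
            checksum = c;                 // racy write; every thread stores the same value
        }
        return c;                         // return the local, never re-read the field
    }
}
```

The local variable matters: the method returns what it read or computed itself, so a stale or reordered second read of the field cannot leak out. A text whose checksum happens to be 0 is simply recomputed on every call, which is harmless.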

Volo
  • This only works for 32-bit primitive values, right? If `hashCode` were a `long`, then it would need to be marked as `volatile`, according to http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html – ygor Nov 28 '18 at 09:45
  • @ygor In general, you're right. Assignment to `long` is not guaranteed to be atomic in Java, even though it is performed atomically in 64-bit JVMs: https://stackoverflow.com/questions/517532/writing-long-and-double-is-not-atomic-in-java#comment79586839_1191390 – Volo Dec 04 '18 at 00:10
6

Hrm. As I read this it is technically incorrect but okay in practice with some caveats. Only final fields can safely be initialized once and accessed in multiple threads without synchronization.

Lazily initialized fields can suffer from synchronization issues in a number of ways. For example, you can have constructor race conditions where the reference to an object has been exported without the object itself being fully initialized.

I think it depends heavily on whether the field is a primitive or an object reference. Primitive fields that can be initialized more than once, where you don't mind several threads doing the initialization, would work fine. However, HashMap-style initialization done this way may be problematic. Even long values on some architectures may be stored as separate words in multiple operations, so a thread may see half of the value, although I suspect that a long would never cross a memory page and that it would therefore never happen.

I think it also depends heavily on whether the application has any memory barriers, meaning any synchronized blocks or accesses to volatile fields. The devil is certainly in the details here: code that does the lazy initialization may work fine on one architecture with one set of code, but not under a different thread model or in an application that synchronizes rarely.
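For the HashMap-style case, a sketch of the conventional alternative that does not rely on incidental memory barriers is double-checked locking on a volatile field. The Registry class, its table() method and the "answer" entry are hypothetical, chosen only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration only.
final class Registry {
    // volatile creates the happens-before edge between the write and later reads
    private volatile Map<String, Integer> table;

    Map<String, Integer> table() {
        Map<String, Integer> t = table;          // one volatile read on the fast path
        if (t == null) {
            synchronized (this) {
                t = table;                       // re-check under the lock
                if (t == null) {
                    Map<String, Integer> m = new HashMap<>();
                    m.put("answer", 42);
                    t = Collections.unmodifiableMap(m);
                    table = t;                   // publish only after the map is populated
                }
            }
        }
        return t;
    }
}
```

Unlike the racy version, this initializes the map exactly once, and any thread that sees a non-null reference also sees the populated contents.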


Here's a good piece on final fields as a comparison:

http://www.javamex.com/tutorials/synchronization_final.shtml

As of Java 5, one particular use of the final keyword is a very important and often overlooked weapon in your concurrency armoury. Essentially, final can be used to make sure that when you construct an object, another thread accessing that object doesn't see that object in a partially-constructed state, as could otherwise happen. This is because when used as an attribute on the variables of an object, final has the following important characteristic as part of its definition:

Now, even if the field is marked final, if it refers to an object you can still modify the fields of that object. This is a different issue, and you must still use synchronization for it.

Gray
4

This works fine under some conditions.

  • it's okay to try to set the field more than once.
  • it's okay if individual threads see different values.

Often, when you create an object that is never changed afterwards (e.g. loading a Properties from disk), having more than one copy for a short amount of time is not an issue.

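// Deliberately neither volatile nor synchronized: several threads may each load a copy.
private static Properties prop = null;

public static Properties getProperties() {
    Properties p = prop; // read the shared field once
    if (p == null) {
        p = new Properties();
        // try-with-resources closes the reader even if load() fails
        try (FileReader in = new FileReader("my.properties")) {
            p.load(in);
        } catch (IOException e) {
            throw new AssertionError(e);
        }
        prop = p; // publish only after load() has returned, not before
    }
    return p;
}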

In the short term this is less efficient than using locking, but in the long term it could be more efficient. (Properties has a lock of its own, but you get the idea ;)

IMHO, it's not a solution that works in all cases.

Perhaps the point is that you can use more relaxed memory consistency techniques in some cases.
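For comparison, when even a short-lived duplicate or a weakly published object is not acceptable, the class-initialization route mentioned in the comments on the question needs neither volatile nor an explicit lock. Below is a minimal sketch of that holder idiom. The Config and Holder names are made up, and the in-memory "greeting" property is an assumption standing in for reading a file.

```java
import java.util.Properties;

// Hypothetical class for illustration only.
final class Config {
    private Config() {}

    // The nested class is not initialized until get() first touches it;
    // the JVM runs its static initializer once, under its own class-init lock.
    private static final class Holder {
        static final Properties PROPS = load();
    }

    private static Properties load() {
        Properties p = new Properties();
        // Assumption: a value set in memory stands in for loading from disk.
        p.setProperty("greeting", "hello");
        return p;
    }

    static Properties get() {
        return Holder.PROPS;
    }
}
```

Every caller of Config.get() gets the same fully initialized instance, at the cost of giving up the "several threads may each build a copy" relaxation.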

Peter Lawrey
  • This suffers from the constructor race condition issue, however. The reference to the object can be exported to another thread before the object is fully initialized. – Gray Jun 15 '12 at 13:57
  • @Gray My understanding of lazy initialization is that it always initializes on demand. It shouldn't be possible to see an uninitialized value, but you could try to set it more than once. – Peter Lawrey Jun 15 '12 at 13:59
  • The issue as I see it, @Peter, is that with no synchronization and no memory barrier, part of an object can be shared between memory caches without the _full_ object storage being updated. If thread A, running on another processor, had its own copy of page #1 but not page #2, and an object with storage in each page was lazily initialized by thread B, then thread A could get a partially initialized object. – Gray Jun 15 '12 at 14:04
  • @Gray I don't understand how `prop` could be partially initialised, for example. – Peter Lawrey Jun 15 '12 at 14:09
  • I think I'm splitting hairs here, @Peter, but as I understood it, the problem is with cached memory pages. Suppose thread A has its own copy of page #1, because it changed something locally in that page, but _not_ its own copy of page #2. If `prop` spans both memory pages and has been initialized by thread B, then when thread A accesses `prop` unsynchronized for the first time, it will load page #2 into its memory but won't refresh page #1, and it might get half of the initialized `prop` object. – Gray Jun 15 '12 at 14:13
  • @Gray is correct. On architectures where writes can be reordered (Alpha/Itanium), it's possible to "see" a half-initialized `Properties` object, e.g. with `Entry[] table` being `null`. This was the prime reason double-checked locking did not work. Always use volatile; reads are cheap or even free. So it's not splitting hairs if the underlying architecture is "weak" enough: to see a properly initialized object, a write-write barrier has to be issued (write-write barriers are cheap, though). – bestsss Jun 17 '12 at 23:03
4

I think the statement is untrue. Another thread can see a partially initialized object: without safe publication, the reference can become visible to another thread before the writes made by the constructor do. This is covered in Java Concurrency in Practice, section 3.5.1:

public class Holder {

    private int n;

    public Holder (int n ) { this.n = n; }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }

}

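This class is at risk of failure if it is not properly published: a thread could see a stale value of n on one read and the up-to-date value on the next.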

If the visible object is immutable, then you are OK, because the semantics of final fields mean you won't see them until the constructor has finished running (section 3.5.2).
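As a rough sketch of that last point, the variant below makes the field final and publishes the object through a plain, racy lazy field. FinalHolder and LazyPublisher are hypothetical names, and the value 42 is arbitrary. The sketch assumes the constructor does not let `this` escape.

```java
// Hypothetical classes for illustration only.
final class FinalHolder {
    private final int n;             // final: frozen when the constructor completes

    FinalHolder(int n) {
        this.n = n;
    }

    int value() {
        return n;
    }
}

final class LazyPublisher {
    private FinalHolder holder;      // plain field, racy on purpose

    FinalHolder get() {
        FinalHolder h = holder;      // read the shared field exactly once
        if (h == null) {
            h = new FinalHolder(42); // several threads may each create an instance
            holder = h;              // the last writer wins; all instances are equivalent
        }
        return h;
    }
}
```

Different threads may briefly get different FinalHolder instances, but any thread that obtains a reference should see n as 42. If n were not final, that guarantee would no longer hold without volatile or synchronization.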

artbristol