5

It appears the Java Memory Model does not define "refreshing" and "flushing" of the local cache, instead people only call it that way for simplicity, but actually the "happens-before" relationship implies refreshing and flushing somehow (would be great if you can explain that, but not directly part of the question).

This is getting me really confused combined with the fact that the section about the Java Memory Model in the JLS is not written in a way which makes it easy to understand.

Therefore could you please tell me if the assumptions I made in the following code are correct and if it is therefore guaranteed to run correctly?

It is partially based on the code provided in the Wikipedia article on Double-checked locking, however there the author used a wrapper class (FinalWrapper), but the reason for this is not entirely obvious to me. Maybe to support null values?

public class Memoized<T> {
    private T value;
    private volatile boolean _volatile;
    private final Supplier<T> supplier;

    public Memoized(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    public T get() {
        /* Apparently have to use local variable here, otherwise return might use older value
         * see https://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html
         */
        T tempValue = value;

        if (tempValue == null) {
            // Refresh
            if (_volatile);
            tempValue = value;

            if (tempValue == null) {
                // Entering refreshes, or have to use `if (_volatile)` again?
                synchronized (this) {
                    tempValue = value;

                    if (tempValue == null) {
                        value = tempValue = supplier.get();
                    }

                    /* 
                     * Exit should flush changes
                     * "Flushing" does not actually exists, maybe have to use  
                     * `_volatile = true` instead to establish happens-before?
                     */
                }
            }
        }

        return tempValue;
    }
}

Also I have read that the constructor call can be inlined and reordered resulting in a reference to an uninitialized object (see this comment on a blog). Is it then safe to directly assign the result of the supplier or does this have to be done in two steps?

value = tempValue = supplier.get();

Two steps:

tempValue = supplier.get();
// Reorder barrier, maybe not needed?
if (_volatile);
value = tempValue;

Edit: The title of this question is a little bit misleading, the goal was to have reduced usage of a volatile field. If the initialized value is already in the cache of a thread, then value is directly accessed without the need to look in the main memory again.

Marcono1234
  • 5,856
  • 1
  • 25
  • 43
  • Your volatile is useless since you do not assign value to it. "Flush" happens at the end of `synchronized` block because according to JMM releasing of a lock happens before subsequent acquiring of the same lock – Ivan Jan 08 '19 at 21:34
  • 4
    I suggest reading this in entirety: https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/ – Aleksey Shipilev Jan 08 '19 at 22:02
  • 1
    The double-checked locking example with `final` is briefly explained [here on Stack Overflow](https://stackoverflow.com/a/30048680/2891664) (also by Aleksey Shipilev). – Radiodef Jan 08 '19 at 22:44
  • "Your volatile is useless since you do not assign value to it." @Ivan, I am using it to create happens-before relationships, not to store any value in it. – Marcono1234 Jan 08 '19 at 23:05
  • 1
    Happens before I'd established between write to volatile and subsequent read. You do not writeanything to that variable. – Ivan Jan 08 '19 at 23:19

2 Answers2

5

You can reduce usage of volatile if you have only a few singletons. Note: you have to repeat this code for each singleton.

enum LazyX {
   ;
   static volatile Supplier<X> xSupplier; // set somewhere before use

   static class Holder {
       static final X x = xSupplier.get();
   }

   public static X get() {
       return Holder.x;
   }
}

If you know the Supplier, this becomes simpler

enum LazyXpensive {
   ;

   // called only once in a thread safe manner
   static final Xpensive x = new Xpensive();

   // after class initialisation, this is a non volatile read
   public static Xpensive get() {
       return x;
   }
}

You can avoid making the field volatile by using Unsafe

import sun.misc.Unsafe;

import java.lang.reflect.Field;
import java.util.function.Supplier;

public class LazyHolder<T> {
    static final Unsafe unsafe = getUnsafe();
    static final long valueOffset = getValueOffset();

    Supplier<T> supplier;
    T value;

    public T get() {
        T value = this.value;
        if (value != null) return value;

        return getOrCreate();
    }

    private T getOrCreate() {
        T value;
        value = (T) unsafe.getObjectVolatile(this, valueOffset);
        if (value != null) return value;

        synchronized (this) {
            value = this.value;
            if (value != null) return value;
            this.value = supplier.get();
            supplier = null;
            return this.value;
        }
    }


    public static Unsafe getUnsafe() {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new AssertionError(e);
        }
    }

    private static long getValueOffset() {
        try {
            return unsafe.objectFieldOffset(LazyHolder.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }
}

However, having the extra look up is a micro optimisation. If you are willing to take a synchronisation hit once per thread, you can avoid using volatile at all.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
3

Your code is not thread safe, which can easily be shown by stripping off all irrelevant parts:

public class Memoized<T> {
    private T value;
    // irrelevant parts omitted

    public T get() {
        T tempValue = value;

        if (tempValue == null) {
            // irrelevant parts omitted
        }

        return tempValue;
    }
}

So value has no volatile modifier and you’re reading it within the get() method without synchronization and when non-null, proceed using it without any synchronization.

This code path alone is already making the code broken, regardless of what you are doing when assigning value, as all thread safe constructs require both ends, reading and writing sides, to use a compatible synchronization mechanism.

The fact that you are using esoteric constructs like if (_volatile); becomes irrelevant then, as the code is already broken.

The reason why the Wikipedia example uses a wrapper with a final field is that immutable objects using only final fields are immune to data races and hence, the only construct that is safe when reading its reference without a synchronization action.

Note that since lambda expressions fall into the same category, you can use them to simplify the example for your use case:

public class Memoized<T> {
    private boolean initialized;
    private Supplier<T> supplier;

    public Memoized(Supplier<T> supplier) {
        this.supplier = () -> {
            synchronized(this) {
                if(!initialized) {
                    T value = supplier.get();
                    this.supplier = () -> value;
                    initialized = true;
                }
            }
            return this.supplier.get();
        };
    }

    public T get() {
        return supplier.get();
    }
}

Here, supplier.get() within Memoized.get() may read an updated value of supplier without synchronization action, in which case it will read the correct value, because it is implicitly final. If the method reads an outdated value for the supplier reference, it will end up at the synchronized(this) block which uses the initialized flag to determine whether the evaluation of the original supplier is necessary.

Since the initialized field will only be accessed within the synchronized(this) block, it will always evaluate to the correct value. This block will be executed at most once for every thread, whereas only the first one will evaluate get() on the original supplier. Afterwards, each thread will use the () -> value supplier, returning the value without needing any synchronization actions.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • And if the outdated `supplier` is read and the `synchronized(this)` is entered and exited the new `supplier` will be read due to the happens-before guarantees of a synchronized block, correct? Do you have a source for lambdas being safely published (if possible JLS)? – Marcono1234 Jun 02 '19 at 10:10
  • Do you know how well this approach performs compared to the other solutions? Aleksey Shipilёv made [performance tests](https://shipilev.net/blog/2014/safe-public-construction/#_performance), but this does not include lambdas. – Marcono1234 Jun 02 '19 at 10:12
  • Also I am going to mark this answer as accepted because it points out what is wrong with my solution, though Peter Lawrey provided great alternative solutions. – Marcono1234 Jun 02 '19 at 10:14
  • 1
    Yes, the `synchronized(this)` provides the necessary happens-before relationship. The safety of the captured values for lambda expressions is not mentioned explicitly, but can be derived from the requirement for the local variables to be `final` or effectively final and definitely assigned before the lambda expression (per JLS§15.27.2.). In practice, these values are copied to `final` fields of the generated class (and any hypothetical future mechanism not needing classes would have to stay compatible with that). Performance is expected to be between the Final Wrapper and the Holder approach. – Holger Jun 03 '19 at 09:46