10

I have this class where I cache instances and clone them (Data is mutable) when I use them.

I wonder if I can face a reordering issue with this.

I've had a look at this answer and JLS but I am still not confident.

public class DataWrapper {
    private static final ConcurrentMap<String, DataWrapper> map = new ConcurrentHashMap<>();
    private Data data;
    private String name;

    public static DataWrapper getInstance(String name) {
        DataWrapper instance = map.get(name);
        if (instance == null) {
            instance = new DataWrapper(name);
        }
        return instance.cloneInstance();
    }

    private DataWrapper(String name) {
        this.name = name;
        this.data = loadData(name);  // A heavy method
        map.put(name, this);  // I know
    }

    private DataWrapper cloneInstance() {
        return new DataWrapper(this);
    }

    private DataWrapper(DataWrapper that) {
        this.name = that.name;
        this.data = that.data.cloneInstance();
    }
}

My thinking: The runtime can reorder the statement in constructor and publish the current DataWrapper instance (put in the map) before initializing the data object. A second thread reads the DataWrapper instance from map and sees null data field (or partially constructed).

Is this possible? If yes, Is it only due to reference escape?

If not, could you please explain me how to reason about the happens-before consistency in simpler terms?

What If I did this:

public class DataWrapper {
    ...
    public static DataWrapper getInstance(String name) {
        DataWrapper instance = map.get(name);
        if (instance == null) {
            instance = new DataWrapper(name);
            map.put(name, instance);
        }
        return instance.cloneInstance();
    }
    
    private DataWrapper(String name) {
        this.name = name;
        this.data = loadData(name);     // A heavy method
    }
    ...
}

Is it still prone to the same issue?

Please note that I don't mind if there are one or two extra instances created if multiple threads try to create and put the instance for the same value at the same time.

EDIT:

What if the name and data fields were final or volatile?

public class DataWrapper {
    private static final ConcurrentMap<String, DataWrapper> map = new ConcurrentHashMap<>();
    private final Data data;
    private final String name;
    ... 
    private DataWrapper(String name) {
        this.name = name;
        this.data = loadData(name);  // A heavy method
        map.put(name, this);  // I know
    }
    ...
}

Is it still unsafe? As far as I understand, the constructor initialization safety guarantees only applies if the reference hasn't escaped during initialization. I am looking for official sources that confirm this.

Community
  • 1
  • 1
Gurwinder Singh
  • 38,557
  • 6
  • 51
  • 76
  • 1
    From my reading of the question you link to - yes, you may have a reordering issue because and see a partially initialized instance via reference escape. Not in the other case when you've moved `map.put` out of the constructor. But I'm not sure, so I'm looking forward for an authoritative answer. – lexicore Oct 05 '17 at 19:15
  • 1
    @GurV Here is my answer https://stackoverflow.com/questions/35883354/is-there-any-instruction-reordering-done-by-the-hotspot-jit-compiler-that-can-be/35903528#35903528 on the same question. Take into account that final and volatile have memory fence(barrier) semantic – Ivan Mamontov Oct 14 '17 at 09:37

2 Answers2

5

The implementation has some very subtle caveats.

It seems you are aware, but just to be clear, in this piece of code multiple threads may get a null instance and enter the if block, unnecessarily creating new DataWrapper instances:

public static DataWrapper getInstance(String name) {
    DataWrapper instance = map.get(name);
    if (instance == null) {
        instance = new DataWrapper(name);
    }
    return instance.cloneInstance();
}

It seems you are ok with that, but that requires the assumption that loadData(name) (used by DataWrapper(String)) will always return the same value. If it may return different values depending on the timing, there's no guarantee that the last thread to load the data will store it in the map, so the value may be outdated. If you say this won't happen or it's not important, that's fine, but this assumption should be at least documented.

To demonstrate another subtle issue, let me inline the instance.cloneInstance() method:

public static DataWrapper getInstance(String name) {
    DataWrapper instance = map.get(name);
    if (instance == null) {
        instance = new DataWrapper(name);
    }
    return new DataWrapper(instance);
}

The subtle issue here is that this return statement is not safe publication. The new DataWrapper instance may be partially constructed, and a thread may observe it in an inconsistent state, for example the fields of the object may not be set yet.

There is a simple fix for this: if you make the name and data fields final, the class becomes immutable. Immutable classes enjoy special initialization guarantees, and the return new DataWrapper(this); becomes safe publication.

With this simple change, and assuming that you're ok with the first point (loadData is not time-sensitive), I think the implementation should work correctly.


I would recommend an additional improvement that is not related to correctness, but other good practices. The current implementation has too many responsibilities: it's a wrapper around Data, and at the same time a cache. The added responsibility makes it a bit confusing to read. And as an aside, the concurrent hash map is not really used to its potential.

If you separate the responsibilities, the result can be simpler, better, and easier to read:

class DataWrapperCache {

  private static final ConcurrentMap<String, DataWrapper> map = new ConcurrentHashMap<>();

  public static DataWrapper get(String name) {
    return map.computeIfAbsent(name, DataWrapper::new).defensiveCopy();
  }
}

class DataWrapper {

  private final String name;
  private final Data data;

  DataWrapper(String name) {
    this.name = name;
    this.data = loadData(name);  // A heavy method
  }

  private DataWrapper(DataWrapper that) {
    this.name = that.name;
    this.data = that.data.cloneInstance();
  }

  public DataWrapper defensiveCopy() {
    return new DataWrapper(this);
  }
}
janos
  • 120,954
  • 29
  • 226
  • 236
  • Thank you for the answer. I am not looking to create `DataWrapperCache`, but I can see that you moved the "put in map" part out of `DataWrapper` constructor. I want to understand the effect of it being (or not being) in the `DataWrapper` Constructor. Also, you are right about the non-final fields. But what if I made them `final` (or `volatile`) and kept the `map.put` call inside constructor as implied by the first part of your answer? Am I safe then? I am looking for some reference to official/reliable sources. – Gurwinder Singh Oct 08 '17 at 07:58
  • @GurV I did write at the end of my first point, that making the fields final should make your implementation safe. You don't need to create the `DataWrapperCache`, but I still recommend it. My arguments are based on my interpretation of the Java Concurrency in Practice book, especially Chapter 3. I don't have links to JLS documentation, because that's way too cryptic for me to read :-/ – janos Oct 08 '17 at 08:56
  • 1
    I cannot fully follow this answer. Also, `new DataWrapper(this)` will not compile from a `static` method. – Rafael Winterhalter Oct 09 '17 at 08:59
  • 1
    That was an oversight, I meant to write `return new DataWrapper(instance)` – janos Oct 12 '17 at 09:09
5

If you want to be spec-compliant, you cannot apply this constructor:

private DataWrapper(String name) {
  this.name = name;
  this.data = loadData(name);
  map.put(name, this);
}

As you point out, the JVM is allowed to reorder this to something like:

private DataWrapper(String name) {
  map.put(name, this);
  this.name = name;
  this.data = loadData(name);
}

When assigning a value to final field, this implies a so-called freeze action at the end of the constructor. The memory model guarantees a happens before relationship between this freeze action and any dereferenzation of the instance on which this freeze action was applied upon. This relationship does however only exist at the end of the constructor, therefore, you break this relationship. By dragging the publication out of your constructor, you can fix this realtionship.

If you want a more formal overview of this relationship, I recommend looking through this slide set. I also explained the relationship in this presentation starting at about minute 34.

Rafael Winterhalter
  • 42,759
  • 13
  • 108
  • 192
  • Thank you for the answer. Question - What if the fields were volatile instead of final? If I moved "map.put" outside the constructor in that case, is the code safe then? Also, Can you please share some references to any official sources that confirm this? Thank you. – Gurwinder Singh Oct 09 '17 at 10:21
  • With volatile fields, you would suffer the same problem. This is actually a recognized problem and something that is supposed to be fixed in JMM9 (http://openjdk.java.net/jeps/188) where a suggestion is to give the constructor freeze guarantees for any field, not just final ones. – Rafael Winterhalter Oct 09 '17 at 19:09
  • As a reference, have a look at 17.5.1 of the JMM (https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html). Unfortunately, it is not the nature of a spec to declare undefined behavior but to specify legal behavior and render anything outside of the spec being undefined. But from the mentioned section, it is quite clear that any other use of constructor fields yields the same racing conditions as any non-constructor reordering. – Rafael Winterhalter Oct 09 '17 at 19:12
  • I stumbled upon [this java docs page](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html), which says: **Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.**. I assume that won't help me unless I move the map.put outside constructor. Could you please confirm? – Gurwinder Singh Oct 14 '17 at 19:45
  • 1
    I completely oversaw that you had a concurrent map there. That map uses volatile values internally which imply happens before guarantees for the fields assigned prior to the store for any subsequent read. This has the same effect as the freeze action. – Rafael Winterhalter Oct 15 '17 at 12:36