
I'm writing a multithreaded application in Java. I have a list that is read by multiple threads, and a single thread that updates it. However, on each update, the update thread creates a new List and then makes the public shared reference point to that new List, like this:

public List<DataObject> publicDataObject = XXXX; // <- this will be accessed by multiple threads

Then I have one thread updating this List:

List<DataObject> newDataObjectList = CreateNewDataObject();
publicDataObject = newDataObjectList;

When I update the pointer of publicDataObject, do I need a lock to make it thread-safe?

HoRn
Wei
    Some sort of memory barrier is required to ensure that the contents of newDataObjectList are published to memory before the pointer is made visible. – Raymond Chen Oct 11 '21 at 20:35

1 Answer


Before going into the answer, let's first check if I understand the situation and the question.

  1. Assumption as stated in problem description:
    Only a single thread creates new versions of the list, based on the previous value of publicDataObject, and stores that new list in publicDataObject.
  2. Derived assumption:
    Other threads access DataObject elements from that list, but do not add, remove or change the order of elements.
    • If this assumption holds, the answer is below.
    • Otherwise, please make sure your question includes this in its description. This makes the answer much more complex, though, and I advise you to study the topic of concurrency more, for example by reading a book about Java concurrency.
  3. Additional assumption:
    The DataObject objects themselves are thread-safe.
    • If this assumption does not hold, the scope of the question becomes too broad, and I would suggest studying the topic of concurrency more, for example by reading a book about Java concurrency.

Answer

Given that the above assumptions are true, you do not need a lock. However, you cannot simply access publicDataObject from multiple threads using the definition in your code example. The reason is the Java Memory Model, which makes no guarantees whatsoever about threads seeing changes made by other threads unless you use special language constructs like atomic variables, locks or synchronization.

Simplified, these constructs ensure that a read in one thread that happens after a write in another thread can see the written value, as long as both threads use the same construct: the same atomic variable, the same lock, or synchronization on the same object. Locks and intrinsic locks (used by synchronization) can also ensure that only a single thread at a time executes a block of code.

Given, again, that the above assumptions are true, it suffices to use an AtomicReference, whose get and set methods have the desired relationship:

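// Definition (needs: import java.util.concurrent.atomic.AtomicReference;)
// Initialized with the starting list (XXXX in the question) so that get() never returns a null reference holder.
public final AtomicReference<List<DataObject>> publicDataObject = new AtomicReference<>(XXXX);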
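
To make the get/set relationship concrete, here is a minimal sketch of the single-writer, many-readers setup from the question. The class name SharedListHolder, the methods publish and current, and the DataObject record are hypothetical stand-ins that are not part of the question; the record assumes Java 16 or later and is immutable, which satisfies the third assumption above.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class SharedListHolder {
    // Hypothetical stand-in for the question's DataObject; immutable, so thread-safe.
    record DataObject(int id) {}

    // The shared reference. It starts with an empty list, so readers never see null.
    static final AtomicReference<List<DataObject>> publicDataObject =
            new AtomicReference<>(List.of());

    // Called only by the single updater thread: swap in a freshly built list.
    static void publish(List<DataObject> newDataObjectList) {
        publicDataObject.set(newDataObjectList);
    }

    // Called by any reader thread: returns the most recently published list.
    static List<DataObject> current() {
        return publicDataObject.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread updater = new Thread(
                () -> publish(List.of(new DataObject(1), new DataObject(2))));
        updater.start();
        updater.join();
        System.out.println(current().size()); // prints 2
    }
}
```

No lock appears anywhere: set publishes the new list together with its contents, and any get that observes the new reference also observes those contents.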

The reasons that a simple construct can be used that "only" guarantees visibility are:

  • The list that publicDataObject refers to is always a new one (the first assumption). As long as other threads don't change the list (the second assumption), its contents never change after it has been published, so other threads can never see a stale value of the list itself. Making sure that threads see the correct value of publicDataObject is therefore enough.
  • If, in addition, only one thread sets publicDataObject, there can be no race conditions in setting it, for example losing updates that are overwritten by more recent updates before ever being read.
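
The "readers don't change the list" condition can be enforced rather than merely assumed. The following is a sketch only: SnapshotReader, replaceWith and totalLength are hypothetical names, String stands in for DataObject, and List.copyOf assumes Java 10 or later. The updater publishes an unmodifiable copy, and each reader fetches the reference once and works on that snapshot, so a concurrent replacement cannot change the list under it mid-iteration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class SnapshotReader {
    // Shared reference; String is a stand-in for the question's DataObject.
    static final AtomicReference<List<String>> shared =
            new AtomicReference<>(List.of());

    // Updater thread: build the list privately, then publish an unmodifiable copy.
    // Later changes to 'working' do not affect what readers see.
    static void replaceWith(List<String> working) {
        shared.set(List.copyOf(working));
    }

    // Reader thread: call get() once and use only the local snapshot afterwards.
    static int totalLength() {
        List<String> snapshot = shared.get();
        int total = 0;
        for (String s : snapshot) {
            total += s.length();
        }
        return total;
    }
}
```

A reader that called shared.get() repeatedly inside its loop could observe two different lists during one pass; reading the reference once avoids that.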
Emmef
    A volatile variable would be sufficient. It will give exactly the same semantics as an AtomicReference (which uses a volatile variable under the hood). – pveentjer Oct 12 '21 at 02:22
  • @pveentjer true. I _personally_ prefer to use atomics for less experienced people. Though it does not apply to this situation, once people get used to volatiles in general, it is easy to start doing stuff like `++volatileVar` and run into "unexpected" problems. It is true that one can and should learn that it is not semantically the same as `atomicInteger.incrementAndGet()`. However, in a big application with many developers, this is not the type of thing you want to have to troubleshoot. And explaining that you should not do that, in a case that has nothing to do with the question, seems wrong. – Emmef Oct 12 '21 at 15:06