0

I have a usecase where multiple threads can be reading and modifying an ArrayList where the default values for these booleans are True.

The only modification the threads can make is setting an element of that ArrayList from True to False.

All of the threads will be also reading the ArrayList concurrently, but it is okay to read staled values.

Note: The size of the ArrayList will not change throughout the lifetime of the ArrayList.

Question:

Is it necessary to synchronize the ArrayList across these threads? The only synchronization I'm doing is marking the ArrayList as volatile such that any update to it will be flushed back to the main memory from a thread's local memory. Is this enough?

Here is a sample code on how this ArrayList gets used by threads

myList is the ArrayList in question and its values are initialized to True

if (!myList.get(index)) {
   return;
} else {
  // do some operations to determine if we should
  // update the value of myList to False.
  if (needToUpdateList) {
      myList.set(index, False);
   }
}

Update

I previously said these threads do not care about staled values which is true. However, I have another thread that only reads these values and perform one final action. This thread does care about staleness. Does the synchronization requirement change?

Is there a cheaper way to "publish" the updated values besides requiring synchronization on every update? I'm trying to minimize locking as much as possible.

user1848861
  • 37
  • 1
  • 5
  • 4
    Volatile ensures visibility of the *variable*, not the object held by the variable. So if you're initialising this list once, and trying to use volatile to guarantee visibility of the elements, then it isn't doing what you think it is – Michael Oct 29 '18 at 16:02
  • Do you mind elaborating this more what you mean by "So if you're initialising this list once then volatile will not help you here at all" – user1848861 Oct 29 '18 at 16:04
  • Sure. It doesn't sound like you're ever changing the variable (i.e. `myList = someNewList`). The only thing volatile helps with is making sure `myList` on each thread points to the right thing. Because you're not reassigning the variable (I'm guessing), it's redundant – Michael Oct 29 '18 at 16:05
  • More on "initializing": it's part of a Java multi-threading pattern. `volatile` makes the field visible, and also makes all previous actions visible (not quite what Michael said). But it doesn't do anything for subsequent actions. C.f. Brian Goetz's **Java Concurrency in Practice** and this related question: https://stackoverflow.com/questions/801993/java-multi-threading-safe-publication – markspace Oct 29 '18 at 16:10
  • Please post the code for how a particular index of List is updated. For example, if the code sets an index to null, then later sets it to new a Boolean, then that's not going to be thread safe unless there is a null check everywhere. – Andrew S Oct 29 '18 at 16:11
  • Hi @AndrewS I have updated my post with an example of how a particular index of List gets updated. – user1848861 Oct 29 '18 at 16:16
  • Unrelated to synchronization, but your use of `List` suggests you want something like a bitfield. As such, List wastes space and time. Consider alternatives such as `BitSet`, `EnumSet` or just plain `int` (with bitwise operations). – DodgyCodeException Oct 29 '18 at 16:16
  • I suspect the OP actually does care about stale values. Otherwise `if (!myList.get(index)) {` is basically a random number generator. But strictly according to what the OP said, they don't care, so... *shrug* – markspace Oct 29 '18 at 16:24
  • @markspace You are right that I care about staleness :) The threads updating the list does not care about reading staled values (bc of the nature of the problem I'm solving), but I have another thread that does care about staleness. – user1848861 Oct 29 '18 at 16:29

4 Answers4

3

As it says in the Javadoc of ArrayList:

Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.

You're not modifying the list structurally, so you don't need to synchronize for that reason.

The other reason you'd want to synchronize is to avoid reading stale values; but you say that don't care about that.

As such there is no reason to synchronize.

Edit for the update #3

If you do care about reading stale values, you do need to synchronize.

An alternative to synchronization which would avoid locking the entire list would be to make it a List<AtomicBoolean>: this would not require synchronization of the list, because you aren't changing the values stored in the list; but reads of an AtomicBoolean value guarantees visibility.

Andy Turner
  • 137,514
  • 11
  • 162
  • 243
0

It depends on what you want to do when an element is true. Consider your code, with a separate thread messing with the value you're looking at:

if (!myList.get(index)) {    // <--- at this point, the value is True, so go to else branch
   return;
} else {
  // <--- at this point, other thread sets value to False.

  // do some operations to determine if we should
  // update the value of myList to False.

  // **** Do these operations assume the value is still True?
  // **** If so, then your application is not thread-safe.

  if (needToUpdateList) {
      myList.set(index, False);
   }
}
DodgyCodeException
  • 5,963
  • 3
  • 21
  • 42
  • This is my thought also, that the OP hasn't consider carefully what "stale values" really means. "Stale values" often means the code in question has no defined semantics (it could do anything at all, and is basically random). But the OP said they don't care, so ... – markspace Oct 29 '18 at 16:26
0

Update

I previously said these threads do not care about staled values which is true. However, I have another thread that only reads these values and perform one final action. This thread does care about staleness. Does the synchronization requirement change?

You just invalidated a lot of perfectly good answers.

Yes, synchronization matters now. In fact probably atomicity matters too. Use a synchronized List or maybe even a Concurrent list or map of some sort.

Anytime you read-modify-write the list, you probably also have to hold the synchronized lock to preserve atomicity:

synchronized( myList ) {
    if (!myList.get(index)) {
       return;
    } else {
      // do some operations to determine if we should
      // update the value of myList to False.
      if (needToUpdateList) {
          myList.set(index, False);
       }
    }
}

Edit: to reduce the time the lock is held, a lot depends on how long "do some operations" take, but a CocurrentHashMap reduces lock contention at the cost of some additional overhead. It might be worth profiling your code to determine the actual overhead and which method is faster/better.

ConcurrentHashMap<Integer,Boolean> myList = new ConcurrentHashMap<>();
//...

if( myList.get( index ) != null ) return;
   // "do some opertaions" here
if( needToUpdate )
   myList.put( index, false );

But I'm still not convinced that this isn't premature optimization. Write correct code first, fully synchronizing the list is probably fine. Then profile working code. If you do find a bottleneck, then you can worry about whether reducing lock contention is a good idea. But probably the code bottleneck won't be in the lock contention and will in fact be somewhere totally different.

Community
  • 1
  • 1
markspace
  • 10,621
  • 3
  • 25
  • 39
  • Hi, the locking version of the code is working correctly. I have done enough profiling in the past month to determine performing updates on the list is the bottleneck, and hence the post. Atomicity actually doesn't matter when doing updates, as long as the set operation from the list is atomic. I have currently reduced the problem to: 1 thread is doing the update. Another thread will be reading this updated value 10 minutes later. Is there a way to "publish" the updates w/o requiring locking on every single update? – user1848861 Oct 29 '18 at 16:45
  • Reading between the lines, I think I'd rather see you use a concurrent queue. Any job that needs further update just gets put in a queue, and then a thread reads the queue to determine what jobs to process further. But you really haven't given us much info. – markspace Oct 29 '18 at 16:47
  • You understand that I don't get notified when you edit a comment, right? @user1848861 – markspace Oct 29 '18 at 17:09
  • Hi @markspace Thanks for your response to my questions, and no I don't usually post questions in stackoverflow so the notification system is new to me. :) – user1848861 Oct 29 '18 at 17:28
0

I did some more googling and found that each thread might be storing the values in the registry or the local cache. The specs only offer some guarantee on when data would be written to shared/global memory.

https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5

basically volatile, synchronized, thread.start(), thread.join()...

So yeah using the AtmoicBoolean will probably be the easiest, but you can also synchronize or make a class with a volatile boolean in it.

check this link out: http://tutorials.jenkov.com/java-concurrency/volatile.html#variable-visibility-problems

Charles
  • 944
  • 5
  • 9
  • Heya, yeah that's what I figured. Problem is the eventual consistency. What is the timeline for eventual consistency dependent on? – user1848861 Oct 29 '18 at 17:24
  • I did some googling and found that I was wrong. You basically have to do one of those things suggested to force the write to memory. – Charles Oct 29 '18 at 18:37
  • Thanks for following up with your answer. I'm going with AtomicBoolean for now :D – user1848861 Oct 29 '18 at 19:03