0

I have a static HashMap<UUID, MyObject> ALL = new HashMap<>(); which is used in multi-threading.

To reproduce the error, I made this code:

HashMap<Integer, String> list = new HashMap<>();

list.put(1, "str 1");
list.put(2, "str 2");

new Thread(() -> {
    while(true) {
        ArrayList<String> val;
        synchronized(list) {
            val = new ArrayList<>(list.values());
        }
        System.out.println(val.toString());
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}).start();

new Thread(() -> {
    while(true) {
        list.put(new Random().nextInt(), "some str");
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}).start();

But, after some seconds (around 10), I get this error with Java 16 and Java 17:

java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
    at java.util.HashMap.valuesToArray(HashMap.java:973) ~[?:?]
    at java.util.HashMap$Values.toArray(HashMap.java:1050) ~[?:?]
    at java.util.ArrayList.<init>(ArrayList.java:181) ~[?:?]

With Java 8, I get this:

Exception in thread "Thread-0" java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1473)
    at java.util.HashMap$ValueIterator.next(HashMap.java:1502)
    at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
    at java.util.ArrayList.<init>(ArrayList.java:178)

To test, I remove synchronized key-word, then try again in Java 17 and I get this:

java.util.ConcurrentModificationException: null
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1631) ~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[?:?]

Those error seems very strange, especially the first one. I suspect that they comes from the JRE itself. I'm using Java 17.0.1 build 17.0.1+12-LTS-39.

How can I get all values from another thread?

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Elikill58
  • 4,050
  • 24
  • 23
  • 45
  • Try to stop / comment your other threads that access HashMap ALL if possible. After that you may be sure about the problem is about concurrent access or not. – tango-1 Dec 17 '21 at 20:20
  • You might want to create a synchronized map using Collections.synchronizedMap. https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedMap-java.util.Map- – Cheng Thao Dec 17 '21 at 20:23
  • @ChengThao warn: I'm using Java 17 – Elikill58 Dec 17 '21 at 20:31
  • @Elikill58 You can still use that in Java 17. – Cheng Thao Dec 17 '21 at 20:41
  • 1
    Another possible solution is to use `ConcurrentHashMap`. – Stephen C Dec 18 '21 at 00:02
  • 5
    As to the strangeness of this: this is exactly what you should expect if you have two threads accessing / updating a non-thread-safe data structure without doing external synchronization. The code is simply wrong. – Stephen C Dec 18 '21 at 00:05
  • @StephenC I understand what you said. The problem is that there is a thread that manage files, which let main thread don't lag. I should do something like this. And so, I used `synchronized` to prevent those type of issue. But I will try the ConcurrentHashMap, thanks ! – Elikill58 Dec 18 '21 at 00:19
  • In a situation like that, it is impossible to get a 100% accurate snapshot without some "lag". You need to modify your application so that it will work with either a lag, or with information that is not an accurate snapshot of the map. – Stephen C Dec 18 '21 at 01:24
  • If you are using `synchronizedMap` there is a very important caveat in the javadocs. *" It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views: ..."* That means that you cannot iterate without (umm) "lag". – Stephen C Dec 18 '21 at 01:27
  • Never mind a `ArrayIndexOutOfBoundsException`, you can get [an infinite loop with unsynchronised operations on a HashMap](https://stackoverflow.com/questions/35534906/java-hashmap-getobject-infinite-loop). Your “discovery” is one of many possible and permitted (since the contract of synchronised access has been broken) outcomes. – Bohemian Dec 19 '21 at 01:38

2 Answers2

4

First of all, you should use better variable names. Even a completely non-informative name is better than using list as the variable name for a HashMap. A HashMap is NOT a list, and it doesn't even behave like a (proper) list when you iterate it. That variable name is just misleading.

So the problem with your code is that it is not synchronizing correctly. The version as written is using synchronized when updating the HashMap, but not when you are accessing it. To get the proper happens before relationships need to make this code work, both the reader and updater threads would need to use synchronized.

Without that a happens before chain, the Java Memory Model does not guarantee that primitive write operations performed by one thread are visible to another. In this case, it means that HashMap operations performed by the reader are liable to encounter stale values. This can cause all sorts of things to go wrong1, including incorrect results, ArrayIndexOutOfBoundsExceptions, NullPointerExceptions and even infinite loops.

In addition, if you simultaneously iterate and update a HashMap you are liable to get a ConcurrentModificationException ... even if the operations are done in a way that ensures that a happens before chain exists.

In short ... this code is wrong.

1 - The actual failure mode and frequency are liable to depend on factors such as your JVM version, your hardware (including the number of cores), and whatever else is going on in your application. And various things you can try to investigate the behavior are liable to make the failure change ... or go away.


So, how can you fix it?

Well, there are two approaches:

  1. Make sure that both the reader and updater threads access the HashMap from inside a synchronized block. In the reader case, be sure to put the entire operation that is iterating the map's value view into the synchronized block. (Otherwise you will get CME's)

    The down-side is that the reader will block the updater and vice-versa. That can lead to "lag" in either thread. (It is probably the updater that you are worried about. For that thread, the "lag" will be proportional to the number of entries in the map ... and what you are doing with the map entries.)

    This is more or less equivalent to using a Collections.synchronizedMap wrapper. You will get the same amount of "lag". Note the important caveat in the javadoc about iterating using a synchronized map wrapper. (Look for "It is imperative that ...")

  2. Change HashMap to ConcurrentHashMap. That will remove the need to perform operations inside synchronized blocks. The ConcurrentHashMap class is thread-safe ... in the sense that you won't need to worry about memory model induced exceptions and heisenbugs.

    The down-side is that iterating a ConcurrentHashMap does not give you a clean snapshot of the map state. If an entry exists at the start of the iteration and hasn't been removed by the end of the iteration, you are guaranteed to see it. But if entries are added or remove, you may or may not see them.


Declaring the Map variable list as volatile won't fix this. Doing that only gives a happens before for reads and writes of the reference variable. But it doesn't give any happens before relations between the operations on the HashMap. So if the reader and updater threads happen to run simultaneously, bad things will happen.

In practice, adding volatile will turn this into a case where the problems occur less frequently, and are much harder to reproduce or test for. IMO, it makes the problem worse.

(Besides, if list is a local variable, as it appears to be in your example, it can't be declared as volatile anyway.)


Q: Is there a solution with O(1) operations that gives you clean map snapshot semantics without lag?

A: AFAIK, no such data structure has been invented / discovered. Certainly, there is no Map implementation in Java SE with those properties.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • Thanks for your complete answer ! For variable names, I know, I just rename them for this snippet. The map will contains less than 100 items, So it doesn't affect to wait for iterate. So, I think I will use the first option. Also, you didn't talk in your answer about the `synchronizedMap` but you did in comment. Can it be better then other option ? – Elikill58 Dec 19 '21 at 09:33
  • 1) *"For variable names, I know, I just rename them for this snippet."* - It is **>>us<<** who you were / are misleading with the bogus variable name. (I don't actually care what you have in your real code ... if I don't have to read it ) 2) I thought you said you were worried about "lag" in the comments. – Stephen C Dec 19 '21 at 12:49
  • XD I understand. Yes I'm worried about lag because one thread, after getting those object, will make I/O operation. But, wait few ns to have a good list and use them – Elikill58 Dec 19 '21 at 13:11
0

"I have a static HashMap<UUID, MyObject> ALL = new HashMap<>(); which is used in multi-threading"

Where is the error!?? ;) (1. static 2. HashMap (NOT thread-safe) 3. multi-threading)

TLDR

Try:

static Map<UUID, MyObject> ALL = java.util.Collections.synchronizedMap(new HashMap<>());

(to "use in multi-threading";).

Javadoc: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Collections.html#synchronizedMap(java.util.Map)

xerx593
  • 12,237
  • 5
  • 33
  • 64