Using ThreadLocal to avoid map creation and reduce GC, but it did not help

Question

The goal is to avoid GC problems on our online (production) servers.

Background

The original map is copied into a worker thread (taken from a thread pool). Inside that thread the copied map may be updated, and after the update some of the copied maps may be fed back into the original map.

Experiment considerations

The online environment differs from my local MacBook in two typical ways:

much more powerful servers (CPU, memory, and IO)
high throughput (million-level QPS)

A typical origMap has about 50 keys, and each value is about 50 chars long.

Current solution

I am now using a ThreadLocal to hold a reusable map, so that each map is bound to its thread: when a copy is required and the thread's map has already been created, we can use that map directly.

(Of course, we need to clear the map first and then copy the content from the original map.)

I thought this would reduce GC, but when I ran some tests with jmh and monitored the result in Visual GC via jvisualvm, I sadly found that it did not work as expected: there are still as many GCs as before.

Updated 2020-02-21

First, many thanks to @Holger and @GotoFinal for their help. I tried several other options with my limited understanding, but so far nothing works in my local tests.

Nothing beneficial has come out of it, so I think I will try something different and dig deeper into JVM optimisation and caching techniques.

For reference, these are the tests I have run:

adjust the map size (number of keys) and the value size (length);
use a plain huge loop instead of jmh to eliminate lurking influences;
use several maps instead of just one (there are scenarios where we need to pass several);
make restrictive changes in the child thread to keep the reuse of entries and nodes high, using
```
cacheMap.putAll(origMap);
cacheMap.keySet().retainAll(origMap.keySet());
```
run the tests longer, from 10 min up to 2 h 30 min.

Some code to demonstrate what I just mentioned:

```
import static java.lang.Thread.sleep; // assumed origin of sleep(...)

import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.TimeUnit;

import org.apache.commons.collections4.MapUtils; // assumed: Apache Commons Collections 4
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Timeout;
import org.openjdk.jmh.annotations.Warmup;

// ThreadUtils.getTheSharedPool() and toJson(...) are project helpers that are not shown here.
public class ReusableHashMapTwoCopy {
    private static final String DEFAULT_MAP_KEY = "defaultMap";
    /**
     * weak or soft reference perhaps could be used: https://stackoverflow.com/a/299702/2361308
     * <p>
     * via the static ThreadLocal initialized, each thread will only see the value it set itself;
     * the per-thread cache (map key -> reusable map) is created lazily on first access.
     */
    private static final ThreadLocal<Map<String, Map<?, ?>>> theCache =
            ThreadLocal.withInitial(HashMap::new);

    /**
     * the default usage when there is only one map passed from parent
     * thread to child thread.
     *
     * @param origMap the parent map
     * @param <K>     generic type for the key
     * @param <V>     generic type for the value
     * @return a map used within the child thread - the reusable map
     */
    public static <K, V> Map<K, V> getMap(Map<K, V> origMap) {
        return getMap(DEFAULT_MAP_KEY, origMap);
    }

    public static <K, V> Map<K, V> getMap() {
        return getMap(DEFAULT_MAP_KEY);
    }

    /**
     * clone the parent-thread map at the beginning of the child thread,
     * after which you can use the map as usual while it's thread-localized;
     * <p>
     * no extra map is created for the thread any more - preventing us from creating
     * map instance all the time.
     *
     * @param theMapKey the unique key to specify the map to be passed into the child thread;
     * @param origMap   the parent map
     * @param <K>       generic type for the key
     * @param <V>       generic type for the value
     * @return the cached map reused within the child thread
     */
    @SuppressWarnings("unchecked") // the caller is trusted to use one key with one map type
    public static <K, V> Map<K, V> getMap(String theMapKey, Map<K, V> origMap) {
        Map<String, Map<?, ?>> threadCache = theCache.get();
        Map<K, V> cacheMap = (Map<K, V>) threadCache.get(theMapKey);
        if (Objects.isNull(cacheMap)) {
            cacheMap = new HashMap<>();
            threadCache.put(theMapKey, cacheMap);
        }
        if (MapUtils.isNotEmpty(origMap)) {
            // no clear() here: clear() would release every node first, so the
            // putAll/retainAll pair below would have nothing left to reuse
            cacheMap.putAll(origMap);
            cacheMap.keySet().retainAll(origMap.keySet());
        } else {
            // null or empty parent map: the cached map is simply emptied
            cacheMap.clear();
        }
        return cacheMap;
    }

    // note: without a parent map this returns the cached map emptied (see above)
    public static <K, V> Map<K, V> getMap(String theMapKey) {
        return getMap(theMapKey, null);
    }

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
//        print(MyState.parentMapMedium_0);
//        print(MyState.parentMapSmall_0);
    }

    // home-made sink that only keeps the maps referenced inside the loop
    private static void blackhole(Object o) {

    }

    @Benchmark
    @Fork(value = 1, warmups = 0, jvmArgs = {"-Xms50M", "-Xmx50M"})
    @Warmup(iterations = 1, time = 5)
    @Timeout(time = 3, timeUnit = TimeUnit.HOURS)
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.MINUTES)
    @Measurement(iterations = 1, time = 150, timeUnit = TimeUnit.MINUTES)
    public void testMethod() throws Exception {
        final Map<String, String> theParentMap0 = MyState.parentMapSmall_0;
        final Map<String, String> theParentMap1 = MyState.parentMapSmall_1;
//        final Map<String, String> theParentMap0 = MyState.parentMapMedium_0;
//        final Map<String, String> theParentMap1 = MyState.parentMapMedium_1;
        ThreadUtils.getTheSharedPool().submit(() -> {
            try {
                // as posted, this variant allocates a fresh copy per task; to exercise
                // the reusable map, call getMap(key, theParentMapN) here instead
                Map<String, String> theChildMap0 = new HashMap<>(theParentMap0);
                theChildMap0.put("test0", "child");

                Map<String, String> theChildMap1 = new HashMap<>(theParentMap1);
                theChildMap1.put("test1", "child");

                for (int j = 0; j < 10; ++j) {
                    blackhole(theChildMap0);
                    blackhole(theChildMap1);
                    sleep(10);
                }
            } catch (Exception e) {
                System.err.println(e.getMessage());
            }
        }).get();
    }

    private static void print(Object o) {
        print(o, "");
    }

    private static void print(Object o, String content) {
        String s = String
                .format("%s: current thread: %s map: %s", content, Thread.currentThread().getName(), toJson(o));
        System.out.println(s);
    }

    @State(Scope.Benchmark)
    public static class MyState {
        // 20 & 100 -> 2.5k
        static Map<String, String> parentMapSmall_0 = generateAMap(20, 100);
        static Map<String, String> parentMapSmall_1 = generateAMap(20, 100);
        // 200 & 200 -> 45k
        static Map<String, String> parentMapMedium_0 = generateAMap(200, 200);
        static Map<String, String> parentMapMedium_1 = generateAMap(200, 200);
    }

    // builds a map with `size` entries whose keys and values are cut to at most lenLimit chars
    private static Map<String, String> generateAMap(int size, int lenLimit) {
        Map<String, String> res = new HashMap<>();
        String aKey = "key - ";
        String aValue = "value - ";
        for (int i = 0; i < size; ++i) {
            aKey = i + " - " + LocalDateTime.now().toString();
            // the value keeps growing each round until lenLimit truncates it
            aValue = i + " - " + LocalDateTime.now().toString() + aValue;
            res.put(aKey.substring(0, Math.min(aKey.length(), lenLimit)),
                    aValue.substring(0, Math.min(aValue.length(), lenLimit)));
        }
        return res;
    }
}
```

Running in an environment with `"-Xms100M", "-Xmx100M"` _will_ require lots of GC, of course. Also, how about describing the problem you have in a simple way? – Eugene, Feb 18 '20 at 16:31
@Holger thanks for the reply ;) Currently there are performance issues that (I suspect) are caused by avoidable GCs. The QPS is at the million level, and the hosts are suffering a lot from GC issues. I am looking into reducing map creation, since I suppose the maps can be reused (perhaps that is the wrong direction). – Hearen, Feb 19 '20 at 03:29
@Eugene I am restricting the heap size to bring the experiments closer to realistic conditions, since my Mac is less powerful than the online servers. Is there a better way to mock the real conditions, in your experience? Any help will be appreciated ;) – Hearen, Feb 19 '20 at 03:31

GotoFinal · Answer 1 · 2020-02-19T10:38:29.210

A HashMap contains an array of Node objects, and if you call hashMap.clear() this array is cleared, so all the Node objects become available for garbage collection.
So caching the map like that will not help at all.

If you want to limit the amount of GC, maybe you could just use a ConcurrentHashMap? If you need to split work between threads, you could pass each thread a list/array of the keys it should operate on and have it return a list of updated values. It is hard to say more without an exact description of the problem you are trying to solve.
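A minimal sketch of the key-partitioning idea, under assumptions that are not in the original: `KeyPartitionDemo`, `process`, and `mergeUpdates` are hypothetical names, and the `"-child"` suffix stands in for whatever the real update is. Workers only read the shared map and return the entries they changed; the parent applies them after every worker has finished, so no per-thread map copy is made. It still allocates small lists and entries, so whether it reduces GC in practice would have to be measured.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class KeyPartitionDemo {

    /** Worker: reads the shared map (never writes) and returns only the entries it changed. */
    static List<Map.Entry<String, String>> process(Map<String, String> shared, List<String> keys) {
        List<Map.Entry<String, String>> updates = new ArrayList<>();
        for (String key : keys) {
            String value = shared.get(key);
            if (value != null) {
                // hypothetical update: append a marker to the current value
                updates.add(new SimpleImmutableEntry<>(key, value + "-child"));
            }
        }
        return updates;
    }

    /** Parent: hands out key lists, waits for all workers, then applies their updates. */
    static void mergeUpdates(Map<String, String> shared,
                             List<List<String>> partitions,
                             ExecutorService pool) {
        List<Future<List<Map.Entry<String, String>>>> futures = new ArrayList<>();
        for (List<String> keys : partitions) {
            futures.add(pool.submit(() -> process(shared, keys)));
        }
        List<Map.Entry<String, String>> updates = new ArrayList<>();
        try {
            // collect everything first, so no worker reads while the parent writes
            for (Future<List<Map.Entry<String, String>>> future : futures) {
                updates.addAll(future.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        }
        for (Map.Entry<String, String> update : updates) {
            shared.put(update.getKey(), update.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put("a", "1");
        shared.put("b", "2");
        shared.put("c", "3");
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            mergeUpdates(shared, Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")), pool);
        } finally {
            pool.shutdown();
        }
        System.out.println(shared);
    }
}
```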

But first you should also consider whether you really need this: you have to stress Java a lot to get real GC issues that can't be solved by some simple tuning, as long as you give the Java process a normal amount of memory.
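One way to check whether GC is really the bottleneck is to count collections around a workload with the standard `java.lang.management` beans. The sketch below is only an illustration: `GcCounter` and its methods are hypothetical names, and the workload (copying a map of 50 keys with 50-char values a million times) merely mimics the sizes mentioned in the question. For the jmh runs, jmh's built-in GC profiler (`-prof gc`) is another option.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

final class GcCounter {

    /** Sum of the collection counts of all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Runs the workload and returns how many collections were reported in the meantime. */
    static long collectionsDuring(Runnable workload) {
        long before = totalCollections();
        workload.run();
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        Map<String, String> orig = new HashMap<>();
        for (int i = 0; i < 50; i++) {
            orig.put("key-" + i, String.format("%050d", i)); // 50-char value
        }
        long collections = collectionsDuring(() -> {
            long sink = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sink += new HashMap<>(orig).size(); // fresh copy per iteration
            }
            if (sink == 0) {
                throw new IllegalStateException("unreachable, keeps the loop observable");
            }
        });
        System.out.println("collections during workload: " + collections);
    }
}
```

Running the same measurement with the reusable-map variant in place of `new HashMap<>(orig)` gives a direct comparison under identical heap settings.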

Another solution is the one proposed by Holger. Instead of calling cacheMap.clear(), do this each time you want to use the map:

```
cacheMap.putAll(origMap);
cacheMap.keySet().retainAll(origMap.keySet());
```
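The difference between the two refresh strategies can be made visible with a small experiment. This is a sketch under one loud assumption: it relies on OpenJDK's `HashMap` handing out its internal node objects as `Map.Entry` instances, which is an implementation detail and not a documented guarantee. `NodeReuseDemo` and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class NodeReuseDemo {

    /** Returns the entry object the map currently holds for the given key, or null. */
    static <K, V> Map.Entry<K, V> entryFor(Map<K, V> map, K key) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            if (entry.getKey().equals(key)) {
                return entry;
            }
        }
        return null;
    }

    /** Refresh with clear() + putAll(): every node is dropped and allocated again. */
    static <K, V> void refreshWithClear(Map<K, V> cache, Map<K, V> orig) {
        cache.clear();
        cache.putAll(orig);
    }

    /** Refresh in place: overwrite values of existing keys, then drop stale keys. */
    static <K, V> void refreshInPlace(Map<K, V> cache, Map<K, V> orig) {
        cache.putAll(orig);
        cache.keySet().retainAll(orig.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> orig = new HashMap<>();
        orig.put("a", "1");
        orig.put("b", "2");

        Map<String, String> cache = new HashMap<>(orig);
        cache.put("childOnly", "x"); // a key added by the child thread

        Map.Entry<String, String> before = entryFor(cache, "a");
        refreshInPlace(cache, orig);
        // identity comparison on purpose: equals() would only compare key and value
        System.out.println("in place, same node: " + (before == entryFor(cache, "a")));

        before = entryFor(cache, "a");
        refreshWithClear(cache, orig);
        System.out.println("clear,    same node: " + (before == entryFor(cache, "a")));
    }
}
```

Both strategies leave the cache equal to origMap; they differ only in how many node objects become garbage on each refresh.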
Additionally, if you don't need to worry about leaking keys (that is, all keys may stay in memory all the time), you could use a riskier solution that gets rid of all allocations but requires using the map in a special way:

```
cacheMap.replaceAll((key, value) -> null);
```

Then all keys are still present in the map but their values are null, so when using the map you simply ignore null values. You could also use some kind of null object if needed and possible.
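A minimal sketch of how such a map could be used, assuming a plain `HashMap` (which permits null values; a `ConcurrentHashMap` would reject them). `NullAsAbsent` and its helper methods are hypothetical names, shown instead of a full `Map` wrapper to keep the idea short.

```java
import java.util.HashMap;
import java.util.Map;

final class NullAsAbsent {

    /** Resets the map without releasing nodes: keys stay, every value becomes null. */
    static <K, V> void reset(Map<K, V> cache) {
        cache.replaceAll((key, value) -> null);
    }

    /** A key counts as present only when it maps to a non-null value. */
    static <K, V> boolean isPresent(Map<K, V> cache, K key) {
        return cache.get(key) != null;
    }

    /** Number of keys that currently carry a value. */
    static <K, V> int liveSize(Map<K, V> cache) {
        int live = 0;
        for (V value : cache.values()) {
            if (value != null) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("a", "1");
        cache.put("b", "2");

        reset(cache);        // both keys remain, both values are now null
        cache.put("a", "3"); // overwrites the value of the existing key "a"

        System.out.println("a present: " + isPresent(cache, "a"));
        System.out.println("b present: " + isPresent(cache, "b"));
        System.out.println("live size: " + liveSize(cache) + " of " + cache.size());
    }
}
```

Callers must go through helpers like these (or a wrapper map), because `containsKey` and `size` on the raw map still see the nulled keys.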

edited Feb 19 '20 at 10:38

answered Feb 18 '20 at 17:16

GotoFinal


Thanks for the reply, I'm thinking about your suggestions. I suppose you meant to say: "so all the Node objects are ~not~ available for Garbage collection"? I think I understand that, as my question *deliberately* mentions the clear operation and emphasizes the purpose: avoiding the map creation process. – Hearen Feb 19 '20 at 03:38
1. The project is kind of huge, and I have to make sure the change is almost **imperceptible** to all the affected modules; 2. ConcurrentHashMap can't be used, since each thread has to own its own map instead of sharing the same map; 3. Array/List is too intrusive, as all related modules would have to update their data structures. It's still a good solution, but perhaps for another case. Thanks for the help ;) – Hearen Feb 19 '20 at 03:41

I guess it was supposed to mean “so all the Node objects are *now* available for Garbage collection”. One way to mitigate the effect is to replace the `cacheMap.clear(); cacheMap.putAll(origMap);` sequence with `cacheMap.putAll(origMap); cacheMap.keySet().retainAll(origMap.keySet());`. Then, all internal node objects whose key is still present are reused. – Holger Feb 19 '20 at 09:30
Yes, I made a horrible typo. – GotoFinal Feb 19 '20 at 09:42
@Hearen how big are these maps? – GotoFinal Feb 19 '20 at 10:12
In addition to @Holger's method, if the number of original keys is limited and you are not worried about leaking keys, you could iterate over the map and set all values to null but leave the keys; then all nodes will be reused. And you could wrap such a map in a wrapper map that automatically treats null values as absent values. – GotoFinal Feb 19 '20 at 10:30
@GotoFinal Many thanks for all the help, bro ;). The size info could be: 50 keys, and each value is about 50 `chars`. As for your idea to leave the values `null`, I'd prefer the method suggested by Holger using `retainAll(origMap.keySet())`, to make sure the space is properly gc-ed if it's not needed. – Hearen Feb 20 '20 at 07:21
@Holger I really appreciate the help and the awesome idea offered in the last comment. I am trying to run more tests (as close as possible to the online situation). I will let you know if there is good news. Thank you for the help ;) – Hearen Feb 20 '20 at 07:27
