
I am new to off-heap storage on the JVM, and ChronicleMap looks like a good fit for it. My main concern is performance.

I ran a simple test with this configuration:

ChronicleMapBuilder<IntValue, BondVOImpl> builder =
                ChronicleMapBuilder.of(IntValue.class, BondVOImpl.class)
                .minSegments(512)
                .averageValue(new BondVOImpl())
                .maxBloatFactor(4.0)
                .valueMarshaller(new BytesMarshallableReaderWriter<>(BondVOImpl.class))
                .entries(ITERATIONS);

and got the following results:

----- Concurrent HASHMAP ------------------------
Time for putting 7258
Time for getting 678

----- CHRONICLE MAP ------------------------
Time for putting 4704
Time for getting 2246

Read performance is quite low compared to ConcurrentHashMap.

I have also tried 1024 and 2048 segments, the default bloat factor, and the default marshaller, but the results stay the same.

I'm only looking to take advantage of the off-heap feature to reduce GC pauses; I have no intention of using persistence, replication, or accessing the map from outside the JVM.
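
For reference, the map in my test is created purely in memory, along these lines (a minimal sketch reusing the builder from the snippet above):

try (ChronicleMap<IntValue, BondVOImpl> map = builder.create()) {
    // purely in-memory, process-local map: no persistence, no replication;
    // the off-heap memory is released when the map is closed
    // ... put/get here as with any ConcurrentMap ...
}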

So the question is: should I use ChronicleMap or stick with ConcurrentHashMap? Or are there other configuration options that could improve ChronicleMap's performance, particularly for reads?

Thanks in advance.

**Benchmarking using https://github.com/OpenHFT/Chronicle-Map/blob/master/src/test/java/net/openhft/chronicle/map/perf/MapJLBHTest.java:**

[screenshot of JLBH benchmark results]

Vivek Dhiman
  • You cannot use off-heap storage for objects without marshalling them. Of course, that’s more expensive than just storing a reference. And only you can answer the important question, *did this approach truly “reduce GC Pauses” in your setup*? And how much? Compare the benefit with the costs and then decide… – Holger Nov 18 '20 at 13:34
  • @Holger I suggest you make an Answer of your Comment, so his Question can be marked as resolved. – Basil Bourque Nov 18 '20 at 17:53
  • 1) How did you test to get those numbers? Without code and proof, they are irrelevant. 2) Off-heap is not necessarily about lower pause times. If you really want lower pause times you should take baby steps first, like analyzing GC logs with your _current_ collector (and maybe improving on that), then most probably move to `Shenandoah` or `ZGC`. As a matter of fact, we used off-heap in our code base some time ago and cut it entirely when we moved to `Shenandoah`. – Eugene Nov 19 '20 at 03:15
  • @Holger, thanks for the response. Looking into Chronicle Map, it performed better for write operations but not so well for reads, so I was wondering whether there is any configuration advice that could improve reads. In the current situation the map occupies a lot of memory and the GC spends a lot of time making room for everything else the process allocates, which ends up causing 7–12 second GC pauses from time to time. – Vivek Dhiman Nov 19 '20 at 03:17
  • @Eugene, the tests are quite simple: just insert 1 million records and then fetch them, nothing fancy. We are trying Shenandoah and G1GC in parallel to improve the overall situation. – Vivek Dhiman Nov 19 '20 at 03:22
  • That’s not a useful comparison. `ConcurrentHashMap` is, as the name suggests, designed for *concurrent* updates. When you’re going to fill a map once, followed by read access only, an ordinary `HashMap` would be sufficient. For real-life cases, you’d have to perform multiple tests with varying parameters, e.g. different amounts of data, different concurrency, changing patterns of reads and writes, and then see how the map responds to these changed parameters. And having a large GC pause time is not proof that `ChronicleMap` will improve the situation. Only an actual test can tell. – Holger Nov 19 '20 at 07:42
  • Agreed, but if single-threaded insertion/reading gives me bad performance, is it going to improve in a multithreaded environment? I plan to do multithreaded tests once I get single-threaded read times close to ConcurrentHashMap. – Vivek Dhiman Nov 19 '20 at 09:07
  • You are missing the point. `ConcurrentHashMap` is good at dealing with concurrent writes, and you didn’t check how `ChronicleMap` performs there. Perhaps that single-threaded, one-time filling is the only operation where `ChronicleMap` appears to be faster, and even then only in an insufficient benchmark. Are you sure you considered every point discussed in [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/q/504103/2711488) I doubt it, as then you would already have checked the influence of environmental parameters. – Holger Nov 19 '20 at 12:43

1 Answer


First of all, I don't buy your test results. You don't provide any code for your benchmark, and I suspect the benchmark is fairly inaccurate (yes, benchmarking is a fairly complicated subject, and without warmup and all the other relevant steps it's meaningless). Our benchmark gives me this:

-------------------------------- SUMMARY (Read) -----------------------------------------------------------
Percentile   run1         run2         run3      % Variation
50:             0.16         0.16         0.21        17.15
90:             0.23         0.20         0.35        33.48
99:             0.46         0.43         0.78        35.19
99.7:           0.74         1.22         1.59        16.83
99.9:           1.52         1.85         2.84        26.06
worst:         36.46      5187.58       161.09        95.41
-------------------------------------------------------------------------------------------------------------------
-------------------------------- SUMMARY (Write) -----------------------------------------------------------
Percentile   run1         run2         run3      % Variation
50:             2.67         2.69         3.05         8.21
90:             3.02         2.95         3.97        18.75
99:             4.51         6.20         9.06        23.50
99.7:           5.86         9.28        15.55        31.07
99.9:         930.56        22.10       964.86        96.60
worst:       1357.31    226033.66    233373.70         2.12
-------------------------------------------------------------------------------------------------------------------

Numbers are in microseconds; the benchmark code is here: https://github.com/OpenHFT/Chronicle-Map/blob/master/src/test/java/net/openhft/chronicle/map/perf/MapJLBHTest.java

And we have seen proof that Chronicle Map beats ConcurrentHashMap in most cases, but it depends on how well the marshalling is implemented.
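
To illustrate what "well-implemented marshalling" looks like, here is a minimal sketch of a hand-written BytesMarshallable value (class and field names are illustrative, not the actual BondVOImpl):

import net.openhft.chronicle.bytes.BytesIn;
import net.openhft.chronicle.bytes.BytesMarshallable;
import net.openhft.chronicle.bytes.BytesOut;

// A flat value with only primitive fields: marshalling is a handful of
// fixed-size writes/reads, with no nested objects and no per-entry garbage.
public class SimpleBondVO implements BytesMarshallable {
    private long issueDate;
    private long maturityDate;
    private double coupon;

    @Override
    public void writeMarshallable(BytesOut bytes) {
        bytes.writeLong(issueDate);
        bytes.writeLong(maturityDate);
        bytes.writeDouble(coupon);
    }

    @Override
    public void readMarshallable(BytesIn bytes) {
        issueDate = bytes.readLong();
        maturityDate = bytes.readLong();
        coupon = bytes.readDouble();
    }
}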

That said, replacing ConcurrentHashMap is not the main use case for Chronicle Map:

  • Off-heap maps can store vast amounts of data, far beyond what heap data structures can handle, without a huge performance penalty.
  • In persisted mode it can be shared between multiple processes (see the sketch after this list).
  • It can be replicated between hosts.
  • etc.
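
A minimal sketch of the persisted mode mentioned above (the file name is hypothetical; BondVOImpl stands in for the value class from the question):

import java.io.File;
import java.io.IOException;
import net.openhft.chronicle.core.values.IntValue;
import net.openhft.chronicle.map.ChronicleMap;

public class PersistedMapSketch {
    public static void main(String[] args) throws IOException {
        // File-backed map: several processes on the same host can open the same file
        // and see each other's entries; the data also survives JVM restarts.
        try (ChronicleMap<IntValue, BondVOImpl> shared = ChronicleMap
                .of(IntValue.class, BondVOImpl.class)
                .averageValue(new BondVOImpl())
                .entries(1_000_000)
                .createPersistedTo(new File("shared-bonds.dat"))) {
            // use 'shared' like a normal ConcurrentMap
        }
    }
}
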
Dmitry Pisklov
  • Regardless of “how well the marshalling is implemented”, it can’t compete with an operation that just passes a reference. Since the actual advantage of off-heap memory is the intended reduction of the (heap) memory management overhead, you can’t measure it with a simple microbenchmark that does just get and put. – Holger Nov 20 '20 at 16:08
  • @Holger put and get in a map are not "simply passing the reference". The hash table lookup is the main cost of both put and get operations, especially in a contended environment, and that is pretty efficient in Chronicle Map. In addition, the Values library can easily compete with passing the reference: it doesn't deserialize the data into a heap object but instead gives a flyweight which holds a reference to the raw bytes, and accessors on that object talk directly to off-heap memory (see the sketch after this comment thread). – Dmitry Pisklov Nov 20 '20 at 19:04
  • Basically, even though the main benefit of using Chronicle Map comes from its "off-heapness", it can be used as a drop-in replacement for ConcurrentHashMap, as the documentation suggests. – Dmitry Pisklov Nov 20 '20 at 19:05
  • Thanks @DmitryPisklov, did you try the same tests with ConcurrentHashMap? – Vivek Dhiman Nov 23 '20 at 08:29
  • @DmitryPisklov there is a contradiction in using it as “a drop-in replacement of ConcurrentHashMap” and “gives a flyweight which holds a reference to the raw bytes”. The latter obviously only works when the value type in the particular use case can be replaced by a flyweight object. Further, the flyweight object still is a heap object, even when its properties are stored off-heap. You need a significant amount of data connected to these properties to make that indirection more efficient than an ordinary heap object. Advertising it as “drop-in replacement” is dubious. It’s a specialized tool. – Holger Nov 23 '20 at 11:55
  • @Holger there's no contradiction here. The flyweight object is an optimization technique unrelated to using ChronicleMap as a replacement for ConcurrentHashMap. The flyweight is normally reused, but it obviously depends on the use-case scenario. We have clients using ChronicleMap as a replacement for ConcurrentHashMap for the sake of a low garbage footprint, and when it is optimized using the techniques described (and others) it performs better. Arguably the benefit indeed comes from the fact that it is off-heap storage, but that doesn't contradict its being used as a ConcurrentHashMap replacement. – Dmitry Pisklov Nov 23 '20 at 13:01
  • So we agree that when using ChronicleMap as a drop-in replacement of ConcurrentHashMap, as the OP apparently does, the marshalling is unavoidable? – Holger Nov 23 '20 at 13:17
  • @DmitryPisklov, yes I did, and the results are not encouraging (except for the worst-case scenario); they are posted in the question itself. – Vivek Dhiman Nov 26 '20 at 05:56
  • @VivekDhiman as has been correctly pointed out by Mr Holger, a linear single-threaded test doesn't show you much. You need a full-fledged, integrated system latency test to see whether using off-heap makes a difference in your latency profile. In an isolated environment with linear (single-threaded) reads/writes the off-heap benefit will always look smaller, especially on writes. – Dmitry Pisklov Nov 26 '20 at 15:31
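
To make the value-reuse idea from the comments above concrete, here is a minimal sketch. It is a simpler variant of the flyweight approach: it reuses one key and one value instance via getUsing(), which avoids per-read allocation, although it still copies the bytes out of off-heap memory. Values.newHeapInstance comes from the Chronicle Values library, and BondVOImpl stands in for the value class from the question:

import net.openhft.chronicle.core.values.IntValue;
import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.values.Values;

public class ReuseOnReadSketch {
    // Reads 'count' entries while reusing a single key and a single value holder,
    // so the hot read loop creates (almost) no garbage.
    static void readAll(ChronicleMap<IntValue, BondVOImpl> map, int count) {
        IntValue key = Values.newHeapInstance(IntValue.class); // reusable key
        BondVOImpl value = new BondVOImpl();                   // reusable value holder

        for (int i = 0; i < count; i++) {
            key.setValue(i);
            BondVOImpl read = map.getUsing(key, value); // fills 'value' instead of allocating
            if (read != null) {
                // use 'read' (same object as 'value'); copy out anything that must
                // outlive this iteration, since it will be overwritten on the next get
            }
        }
    }
}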