12

It came to my attention that java.util.HashMap produces garbage for the GC when used on my high-performance system, which basically is a selector reading from the network. Is there an alternative to java.util.HashMap (i.e. does not even need to implement java.util.Map, in other words, it can have its own API) that I can use that will leave no garbage behind?


GARBAGE = objects that go out of scope and will have to be collected by the GC.


For @durron597:

public static void main(String[] args) {

    Map<String, String> map = new HashMap<String, String>();

    while(true) {

        map.put("foo1", "bah1");
        map.put("foo2", "bah2");

        map.remove("foo1");

        Iterator<String> iter = map.keySet().iterator();

        while(iter.hasNext()) {
            iter.next();
        }
    }
}

Now run that with -verbose:gc and see what happens... :)

chrisapotek
  • 6,007
  • 14
  • 51
  • 85
  • 2
    What do you mean by "garbage"? – durron597 May 23 '14 at 11:48
  • 1
    Garbage as in GC behavior or memory usage? Or do you just want to evict old entries when memory runs low? – nanofarad May 23 '14 at 11:50
  • @durron597 garbage is objects not being in use anymore – NoDataDumpNoContribution May 23 '14 at 11:50
  • Let's try again. Objects that go out of scope **in what scenarios**? When you remove from the table? Are you talking about `Map.Entry` objects? You say it produces "a lot" of garbage. What do you mean by "a lot"? – durron597 May 23 '14 at 11:52
  • @durron597 I don't know. I am just using a HashMap and it is producing garbage as I put, remote, get, iterate. I did not write that code so I don't know where and why it is producing garbage :) I removed the "a lot". I cannot produce any garbage, that's my requirement. – chrisapotek May 23 '14 at 11:53
  • there is bound to be some values cached when you use map. i am not sure about the high performance system what you excatly mean. enterprise apps use heavy hasmaps for caching till now i heard no problem on the profiler – vikeng21 May 23 '14 at 11:54
  • @durron597 I will produce an example, just hand on tight there pal. – chrisapotek May 23 '14 at 11:57
  • You could always write your own hash table implementation that use a pool of entries and recycled them, vs using new and GC. This is a common strategy outside of GCed environments. – Hot Licks May 23 '14 at 11:58
  • Probably the use of the HashMap is very intensive (adding and removing entries). This causes in much "dead" objects that the gc must collect to allow the reuse of the allocated space. To force this collecting try to force the GC to collect more often by direct calling it. This results in smaller portions of your "garbage behind". – double_word_disruptor May 23 '14 at 11:59
  • 1
    @double_word_disruptor - There's rarely a situation where forcing GC more often will improve performance. – Hot Licks May 23 '14 at 11:59
  • 2
    @double_word_disruptor [Don't ever call `System.gc()`. Ever.](http://stackoverflow.com/a/2414120/1768232) – durron597 May 23 '14 at 12:00
  • @durron597 Right, a rarely situation. But never? – double_word_disruptor May 23 '14 at 12:03
  • 1
    @double_word_disruptor https://blog.codecentric.de/en/2010/08/invoking-system-gc-can-have-serious-impact/ [Here's one example that's technically a good use, but it's not really an example, click to see why](http://stackoverflow.com/a/66694/1768232) – durron597 May 23 '14 at 12:05
  • Okay, so really you don't want to have *any* extra objects, period. I think @Slanec's answer is correct (I'm one of the upvotes on it) – durron597 May 23 '14 at 12:07

4 Answers4

11

Yes. Take a look e.g. at Goldman Sachs collections.

They have a complete reimplementation of the JDK's collection framework (and much more) with an emphasis on low memory footprint. For example, their HashMap doesn't create the Entry objects until they really need to. Look at the documentation here.

There's also Javolution, a smaller library with a slightly different purpose - mainly for close-to-realtime and time-predictable classes, this also implies low garbage.

If you want to store primitives (which avoids the creation of their wrappers), look at one of these:

  • Trove - the "standard" collections for primitives
  • Goldman Sachs collections, again
  • HPPC - lower level access, often slightly faster than Trove, but enables you to shoot yourself into the foot more easily
  • Koloboke - a Trove fork made by people from OpenHFT. Insanely fast, evolving fast. As of now (September 2014), only Maps and Sets are supported.

EDIT 2020:

Also see https://github.com/TimeAndSpaceIO/SmoothieMap.

Petr Janeček
  • 37,768
  • 12
  • 121
  • 145
  • 2
    GS Collections has migrated to the Eclipse Foundation and is now Eclipse Collections - http://www.eclipse.org/collections/ – Donald Raab Mar 05 '17 at 01:51
7

We have also written a collection of data structures called CoralBits that provides high-performance with zero garbage creation. It re-uses iterators and pools map entry objects. For maps that use primitives as keys, we wrote IntMap and LongMap. For a general purpose map we wrote PooledHashMap which implements java.util.Map so you can swap in your code for zero garbage.

Trove and Javolution are other alternatives but we have found that Javolution creates garbage in some situations.

CoralBits also provides a MemorySampler instrumentation class that you can use to find out where garbage is being created in your code. In the case of a java.util.HashMap the culprit is:

java.util.HashMap.createEntry(HashMap.java:901)

You can take a look in this article written by me that gives an example of how to use MemorySampler to detect garbage in your applications.

Disclaimer: I am one of the developers of CoralBits.

rdalmeida
  • 1,806
  • 12
  • 16
  • How can you tell the line where the garbage is being created? – JohnPristine May 23 '14 at 12:39
  • The line is where the object is being allocated. Whether or not it will become garbage it is another story. In the case of `java.util.HashMap` it is released for the GC so it indeed becomes garbage. We use `-javaagent` to perform this instrumentation. – rdalmeida May 23 '14 at 12:43
4

You can avoid a lot of the Garbage Collection if you store your map entries off-heap. There are a number of libraries that can help you with that:

leventov
  • 14,760
  • 11
  • 69
  • 98
Andrejs
  • 26,885
  • 12
  • 107
  • 96
  • This can be true, but it depends on read-write access patterns, and the API of the off-heap collection. On the read path, some off-heap collections can produce more object churn and therefore garbage than on-heap collections, if they will deserialize the off-heap data and create new Java object representations of it each time the data is read. On-heap collections can avoid this by returning the same object which is already on-heap each time. – npgall Nov 23 '16 at 18:25
2

The LibGdx libraries have ArrayMap, which is a no-garbage version of hashmap.

http://libgdx.badlogicgames.com/

They have several other no-garbage collections. https://github.com/libgdx/libgdx/tree/master/gdx/src/com/badlogic/gdx/utils

They work great, with a minor limitation of not allowing nested recursion for the same iterator.

Nick Bilyk
  • 352
  • 2
  • 8