34

I habitually use HashMap in my programs, since I know it is usually the most efficient (if properly used) and can cope with large maps easily. I know about EnumMap which is very useful for enumeration keys, but often I am generating a small map which will never get very big, is likely to be discarded pretty soon, and has no concurrency issues.

Is HashMap<K,V> too complicated for these small, local and temporary uses? Is there another, simple, implementation which I can use in these cases?

I think I'm looking for a Map implementation which is analogous to ArrayList for List. Does it exist?


Added later after responses:

Here is a scenario where a slow but very simple implementation might be better -- when I have many, many of these Maps. Suppose, for example, I have a million or so of these tiny little maps, each with a handful (often less than three) of entries. I have a low reference rate -- perhaps I don't actually reference them before they are discarded most of the time. Is it still the case that HashMap is the best choice for them?

Resource utilisation is more than just speed -- I would like something that doesn't fragment the heap a lot and make GCs take a long time, for example.

It may be that HashMap is the right answer, but this is not a case of premature optimisation (or at least it may not be).


Added much later after some thought:

I decided to hand-code my own SmallMap. It is easy to make one with AbstractMap. I have also added a couple of constructors so that a SmallMap can be constructed from an existing Map.

Along the way I had to decide how to represent Entrys and to implement SmallSet for the entrySet method.

I learned a lot by coding (and unit-testing this) and want to share this, in case anyone else wants one. It is on github here.

River
  • 8,585
  • 14
  • 54
  • 67
Steve Powell
  • 25,354
  • 8
  • 41
  • 44
  • 1
    Just use `HashMap` and set an appropriate initial capacity, you can't do better than that (unless you can use `EnumMap` of course). – Viruzzo Jan 12 '12 at 13:42
  • The reason for not there being a BigSlowMap and a FastSmallMap is that the basic implementation is adaptable enough. – Viruzzo Jan 12 '12 at 13:43
  • See also the answers to [this](http://stackoverflow.com/questions/633299/anyone-know-of-a-java-util-map-implementation-optimized-for-low-memory-use) question. – glyn Jan 12 '12 at 15:25
  • @viruzzo Speed is not the issue, here. Memory utilisation may be. – Steve Powell Jan 12 '12 at 15:30
  • @Glyn Normington Thanks; the answers there aren't very illuminating, but the question is right on the money. Why didn't I find this when I looked for it?? :-) – Steve Powell Jan 12 '12 at 15:32
  • need more information -- why are you creating millions of these tiny maps? Is this a web application? Sounds like you might need a custom data structure. – hvgotcodes Jan 12 '12 at 15:50
  • @hvgotcodes Here is a hypothetical scenario. I am processing a queue of messages. Each message can have a collection of properties, which is a map from property name (String) to property value (Object). Often messages have no properties; often just one or two. These messages come in rather frequently on some socket (say), are created for processing (with property maps), and discarded after processing. I want the memory and GC overhead for each message to be as small as possible. A HashMap is not cheap enough and overkill. – Steve Powell Apr 17 '12 at 10:06

10 Answers10

23

There is no standard small implementation of Map in Java. HashMap is one of the best and most flexible Map implementations around, and is hard to beat. However, in the very small requirement area -- where heap usage and speed of construction is paramount -- it is possible to do better.

I have implemented SmallCollections on GitHub to demonstrate how this might be done. I would love some comments on whether I have succeeded. It is by no means certain that I have.

Although the answers offered here were sometimes helpful, they tended, in general, to misunderstand the point. In any case, answering my own question was, in the end, much more useful to me than being given one.

The question here has served its purpose, and that is why I have 'answered it myself'.

Steve Powell
  • 25,354
  • 8
  • 41
  • 44
  • Have you by any chance done a performance test to compare your implementation vs hashmap? – dan b Feb 18 '15 at 09:37
  • 1
    @danb As it happens, I have. I tried to put the results in the Git repo, but they aren't great. I ran out of interest :-) But the overall result was that SmallCollections is indeed slower than HashMap, for more than about five entries, but consumes less memory, especially for less than two entries or so. It is faster, too, if there are less than three entries. The zero-entry case is fantastically efficient :-) – Steve Powell Apr 08 '15 at 10:46
12

I think this is premature optimization. Are you having memory problems? Performance problems from creating too many maps? If not I think HashMap is fine.

Besides, looking at the API, I'm not seeing anything simpler than a HashMap.

If you are having issue, you could roll your own Map implementation, that has very simple internals. But I doubt you would do better than default Map implementations, plus you have the overhead of making sure your new class works. In this case there might be a problem with your design.

hvgotcodes
  • 118,147
  • 33
  • 203
  • 236
  • 2
    `Hashtable` is synchronized, so it would be worse. – Viruzzo Jan 12 '12 at 13:41
  • 3
    Well, I understand what premature optimisation is and this isn't it. I have to make an implementation choice, and I want to make it right. Simplicity misses the point also, the API is the same `Map<,>` for all implementations I might choose. I'll add to the question to make this crystal clear why I might want a simpler, slower `Map` implementation. – Steve Powell Jan 12 '12 at 15:10
5

A HashMap is possibly the most light weight and simple collection.

Sometimes the more efficient solution is to use a POJO. e.g. if your keys are field names and/or your values are primitives.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
3

IdentityHashMap

If you are truly concerned about performance, consider IdentityHashMap.

This class tracks keys by their reference in memory rather than by the content of the key. This avoids having to call the hashCode method on your key, which may or may not be lengthy if not cached internally within its class.

This class provides constant-time performance for the basic operations (get and put).

Read the Javadoc carefully when choosing this implementation of Map.

Map.of

Java 9 and JEP 269: Convenience Factory Methods for Collections brought us the convenience of literals syntax for concisely defining a map with Map.of (and list with List.of).

If your situation can use an unmodifiable map, use this simple syntax.

Note that the Map.of methods return a Map interface object. The underlying concrete Map class is unknown and may vary depending on your particular code and the version of Java. The Java implementation is free to optimize in its choice of a concrete class. For example, if your Map.of uses an enum for keys, the Map.of command might choose an EnumMap, or it might choose some yet-to-be-devised class optimized for very small maps.

Map< DayOfWeek , Employee > dailyWorker = 
    Map.of(
        DayOfWeek.MONDAY , alice ,
        DayOfWeek.TUESDAY , bob ,
        DayOfWeek.WEDNESDAY , bob ,
        DayOfWeek.THURSDAY , alice ,
        DayOfWeek.FRIDAY , carol ,
        DayOfWeek.SATURDAY , carol ,
        DayOfWeek.SUNDAY , carol
    )
;

Premature Optimization

As others noted, you are likely falling into the trap of premature optimization. Your app's performance is not likely to be impacted significantly by your choice of small map implementation.

Other considerations

There are other considerations you should likely be focusing on, such as concurrency, tolerance for NULLs, and iteration-order.

Here is a graphic chart I made listing those aspects of the ten implementations of Map bundled with Java 11.

Table of map implementations in Java 11, comparing their features

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
  • This is a very good answer but it misses the point of the question by confusing performance with speed. – Steve Powell Jan 30 '20 at 08:10
  • In a system dealing with a fast stream of messages, each with a very small number of properties with arbitrary names. Many messages will be handled. The routing and transfer of the messages rarely needs to use the properties. When it does it may only look at one or two. A hashmap for these is overkill, fragments the heap and uses more space than necessary. A simple array-based solution is quicker to create and use, has a smaller footprint and is potentially reusable from a pool with lower overhead than a hashmap. Performance isn’t only about get and put speeds and large maps. – Steve Powell Jan 30 '20 at 08:31
  • Oh, and I’m not writing an app. – Steve Powell Jan 30 '20 at 08:34
2

Android has an ArrayMap with the intent of minimizing memory. In addition to being in the core, it's in the v4 support library, which, theoretically, should be able to compile for the Oracle or OpenJDK JREs as well. Here is a link to the source of ArrayMap in a fork of the v4 support library on github.

Travis Well
  • 947
  • 10
  • 32
  • 1
    Thank you for this reference; it is quite interesting. The meat of the code is in the base class [SimpleArrayMap](https://github.com/futuresimple/android-support-v4/blob/master/src/java/android/support/v4/util/SimpleArrayMap.java), which bears some examination. – Steve Powell Jun 28 '17 at 21:36
  • I have considered, mainly for ArrayMap, trimming all the Android dependencies and porting Android collections and utilities to Oracle/OpenJDK (but I probably won't). @StevePowell If you do anything along those lines, post a comment so I can check it out. – Travis Well Jun 29 '17 at 18:53
2

HashMap is a good choice because it offers average case O(1) puts and gets. It does not guarantee ordering though like SortedMap implementations (i.e. TreeMap O(log n) puts and gets) but if you have no requirement for ordering then HashMap is better.

Dev
  • 11,919
  • 3
  • 40
  • 53
1

I agree with @hvgotcodes that it is premature optimization but it is still good to know all tools in the toolbox.

If you do a lot of iterations over what is in a map, a LinkedHashMap is usually quite a lot faster than a HashMap, if you have a lot of threads working with the Map at the same time, a ConcurrentHashMap is often a better choice. I wouldn't worry about any Map implementation being inefficient for small sets of data. It is typically the other way around, an incorrectly constructed map easily gets inefficient with large amounts of data if you have bad hash values or if something causes it to have too few buckets for its load.

Then of course there are cases when a HashMap makes no sense at all, like if you have three values which you will always index with the keys 0, 1 and 2 but I assume you understand that :-)

Fredrik
  • 5,759
  • 2
  • 26
  • 32
1

HashMap uses more or less memory (when created) depending on how you initialize it: more buckets mean more memory usage, but faster access for large amounts of items; if you need only a small number of items you can initialize it with a small value, which will produce less buckets that will still be fast (since they will each receive a few items). There is no waste of memory if you set it correctly (the tradeoff is basically memory usage vs speed).

As for heap fragmentation and GC cycle wasting and whatnot, there is not much that a Map implementation can do about them; it all falls back to how you set it. Understand that this is not about Java's implementation, but the fact that generic (as in, for example, cannot assume anything about key values like EnumMap does) hashtables (not HashTables) are the best possible implementations of a map structure.

Viruzzo
  • 3,025
  • 13
  • 13
0

I also was interested and just for an experiment I created a map which stores keys and values just in fields and allows up to 5 entries. It consumes 4 less memory and works 16 times faster than HashMap https://github.com/stokito/jsmallmap

Sergey Ponomarev
  • 2,947
  • 1
  • 33
  • 43
-1

There is an alternative called AirConcurrentMap that is more memory efficient above 1K Entries than any other Map I have found, and is faster than ConcurrentSkipListMap for key-based operations and faster than any Map for iterations, and has an internal thread pool for parallel scans. It is an ordered i.e. NavigableMap and a ConcurrentMap. It is free for non-commercial no-source use, and commercially licensed with or without source. See boilerbay.com for graphs. Full disclosure: I am the author.

AirConcurrentMap conforms to the standards so it is plug-compatible everywhere, even for a regular Map.

Iterators are already very fast especially over 1K Entries. The higher-speed scans use a 'visitor' model with a single visit(k, v) callback that reaches the speed of Java 8 parallel streams. The AirConcurrentMap parallel scan exceeds Java 8 parallel streams by about 4x. The threaded visitor adds split() and merge() methods to the single-thread visitor that remind one of map/reduce:

static class ThreadedSummingVisitor<K> extends ThreadedMapVisitor<K, Long> {
    private long sum;
    // This is idiomatic
    long getSum(VisitableMap<K, Long> map) {
        sum = 0;
        map.getVisitable().visit(this);
        return sum;
    }

    @Override
    public void visit(Object k, Long v) {
        sum += ((Long)v).longValue();
    }

    @Override
    public ThreadedMapVisitor<K, Long> split() {
        return new ThreadedSummingVisitor<K>();
    }

    @Override
    public void merge(ThreadedMapVisitor<K, Long> visitor) {
        sum += ((ThreadedSummingVisitor<K>)visitor).sum;
    }
}
...
// The threaded summer can be re-used in one line now.
long sum = new ThreadedSummingVisitor().getSum((VisitableMap)map);
  • 2
    Though you've disclosed that you are the author, you've not told us how to *use* AirConcurrentMap - some examples would make your answer much better and improve its usefulness. [From review](http://stackoverflow.com/review/low-quality-posts/11707523) – Wai Ha Lee Mar 21 '16 at 02:52
  • 1
    Thanks for the suggestion. AirConcurrentMap is a standard ConcurrentNavigableMap so it is compatible with all other Java Maps and can be plugged in by adding airconcurrentmap.jar to the CLASSPATH and changing the existing constructors. – Roger Deran Mar 22 '16 at 22:41
  • It should be noted, that "AirConcurrentMap" by "Boiler Bay" seems to be a commercial product. PLEASE READ THE LICENSES FIRST! As I understand, you have to pay an annual(!) fee in order to use it for any commercial purpose which probably includes putting a free program with it on a website with ads (or under some circumstances even without ads). – Code ninetyninepointnine May 06 '18 at 12:02