40

Is there a theoretical limit for the number of key entries that can be stored in a HashMap or does the maximum purely depend on the heap memory available?

Also, which data structure is the best to store a very large number of objects (say several hundred thousand objects)?

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
Ebbu Abraham
  • 2,036
  • 5
  • 22
  • 30

4 Answers4

49

Is there a theoretical limit for the number of key entries that can be stored in a HashMap or does it purely depend on the heapmemory available ?

Looking at the documentation of that class, I would say that the theoretical limit is Integer.MAX_VALUE (231-1 = 2147483647) elements.

This is because to properly implement this class, the size() method is obliged to return an int representing the number of key/value pairs.

From the documentation of HashMap.size()

Returns: the number of key-value mappings in this map

Note: This question is very similar to How many data a list can hold at the maximum.


which data structure is the best to store a very large number of objects(say several hundred thousand objects)?

I would say it depends on what you need to store and what type of access you require. All built in collections are probably well optimized for large quantities.

Community
  • 1
  • 1
aioobe
  • 413,195
  • 112
  • 811
  • 826
  • And I just noticed the size():int method :) – pgras Nov 08 '10 at 12:41
  • 5
    Actually, `size()` is not really a problem: *If the map contains more than `Integer.MAX_VALUE` elements, returns `Integer.MAX_VALUE`.* - http://download.oracle.com/javase/6/docs/api/java/util/Map.html#size() Only client code not recognizing the true meaning of `size()` should be a problem. – gustafc Nov 08 '10 at 12:50
  • 1
    @gustafc, ah, interesting point. However, the documentation of `HashMap.size()` refines this specification: http://download.oracle.com/javase/6/docs/api/java/util/HashMap.html#size%28%29 It's questionable though, whether or not that's probably a "bug" in the specification :-) – aioobe Nov 08 '10 at 12:56
  • @aioobe - well, looking at the Sunoracle implementation of `HashMap`, it doesn't really seem to grok that the `size` field can overflow. Spec compliance? :) So maybe I should've phrased that "`size()` *should* not really be a problem"... – gustafc Nov 08 '10 at 13:04
  • @gustafc, provided there is no one already, are you filing a bug-report, or shall I do it? If you do it, I'd be interested in a link to the report. – aioobe Nov 08 '10 at 13:06
  • @gustafc: Even if the definition of Map#size() allows maps to contain more than Integer.MAX_VALUE elements, a concrete implementation like HashMap does of course not have to support more than Integer.MAX_VALUE elements and it's IMHO obviously clear that HashMap does not. – jarnbjo Nov 08 '10 at 13:10
  • @jarnbjo, I've read through the source now, and I can't see there is a problem. If they just keep track of the `size`-overflow, and revise the documentation for `size()` I'd say that there wouldn't be a limitation to `Integer.MAX_VALUE`... – aioobe Nov 08 '10 at 13:14
  • @aioobe, bug filed (but not available at the time of writing): http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6998213 – gustafc Nov 08 '10 at 14:56
  • @gustafc, oh crap. I filed one too hehe :-) well that should give the issue some attention :-) – aioobe Nov 08 '10 at 16:10
  • @gustafc: They can't change the implementation now, since it would not be backwards compatible. Existing code relies on that if HashMap#size() returns Interger.MAX_VALUE, the map contains Integer.MAX_VALUE elements and not more. – jarnbjo Nov 09 '10 at 08:42
  • @jarnbjo, reading the code it looks like the size-variable would overflow... and the HashMap#size() returns the value of the size-variable. – aioobe Nov 09 '10 at 08:45
  • 1
    @jarnbjo, I'm not entirely convinced this won't be fixed - we can just as well argue that there may be code relying on HashMap correctly implementing Map. We'll see who's right when my or aioobe's bug reports are responded to. – gustafc Nov 09 '10 at 09:49
  • For the record, the bug has been re-filed [here](https://bugs.openjdk.java.net/browse/JDK-6998187). – aioobe Mar 11 '15 at 12:49
  • 230 is the max size of the HashMap but you can store more elements than the 230. If you look at the HashMap implementation then you can find the variable declaration `static final int MAXIMUM_CAPACITY = 1 << 30;` After 230 hash map will start behaving weirdly but it does allow more numbers than the 230. – Sudhir Dhumal Jun 14 '19 at 09:11
  • This thread is putting it in better way: https://stackoverflow.com/questions/19886017/how-many-elements-can-i-store-in-a-hashmap-object-in-java – Sudhir Dhumal Jun 14 '19 at 09:13
  • @SudhirDhumal, from what I can tell, you're wrong in 2 regards. 1) The question is about the general contract, and you're talking about details of a specific implementation (OpenJDK I assume?). The reasoning does not necessarily hold true for other implementations (GNU Classpath being one example) or even possible future versions of OpenJDK. The only authoritative source here is the _contract_ of the class, which is spelled out in the signature of the methods and the Javadoc. 2) The _capacity_ is the number of buckets. Each bucket can contain many _items_. – aioobe Jun 14 '19 at 13:39
  • If 2 power 31-1 is the limit of hashmap.. what will happen if i add one more entry. would i get exception? – vijaya kumar Sep 19 '21 at 16:55
  • @vijayakumar I don't think that's defined. – aioobe Sep 20 '21 at 14:59
12

HashMap holds the values in an array, which can hold up to Integer.MAX_VALUE. But this does not count collisions. Each Entry has a next field, which is also an entry. This is how collisions (two or more objects with the same hashcode) are resolved. So I wouldn't say there is any limit (apart from the available memory)

Note that if you exceed Integer.MAX_VALUE, you'll get unexpected behaviour from some methods, like size(), but get() and put() will still work. And they will work, because the hashCode() of any object will return an int, hence by definition each object will fit in the map. And then each object will collide with an existing one.

Bozho
  • 588,226
  • 146
  • 1,060
  • 1,140
  • 1
    The documentation of HashMap#size() actually limits the number of key-value mappings to a number representable by an int. If a HashMap would allow more mappings, the documentation of the size() method would have been inherited from Map#size(), which defines the method to return Integer.MAX_VALUE if the size is > Integer.MAX_VALUE. – jarnbjo Nov 08 '10 at 12:40
  • `hashCode()` returns `int`. Hence after the table is full, there will be only collisions. – Bozho Nov 08 '10 at 12:44
0

I agree with @Bozho's and will also add that you should read the Javadoc on HashMap carefully. Note how it discusses the initial capacity and load factor and how they'll affect the performance of the HashMap.

HashMap is perfectly fine for holding large sets of data (as long as you don't run out of keys or memory) but performance can be an issue.

You may need to look in to distributed caches/data grids if you find you can't manipulate the datasets you need in a single Java/JVM program.

Martijn Verburg
  • 3,287
  • 21
  • 26
0

There is no theoretical limit, but there is a limit of buckets to store different entry chains (stored under a different hashkey). Once you reach this limit every new addition will result in a hash collision -- but this is no a problem except for performance...

pgras
  • 12,614
  • 4
  • 38
  • 46