What is the maximum size of a HashMap/map object in C++ and Java? I want to use a hash map, but I am working with huge data. I am worried that if I use it on large data, it may crash because of a capacity limit. Is that so? If so, what is the alternative?
- Have you considered using a database? – Marcelo Dec 20 '11 at 17:59
9 Answers
In Java, the size() of a HashMap is of type int, so there's an upper bound of 2^31 - 1 elements in the map.
In C++, map::max_size returns the maximum number of elements. In a vanilla map, there's an upper bound of at most SIZE_MAX elements, which is 2^64 - 1 on modern 64-bit hardware.

In C++, std::map has a max_size() member function, which reports the maximum number of elements it can hold. sizeof(std::map<...>) gives you the size of the map object itself, not of the data it holds.

- ... but the "size of the actual object" doesn't really mean anything; it's a very minimal lower bound on the actual memory use, of interest only to allocators. – Fred Foo Dec 20 '11 at 18:00
- Neither of those expressions will report the actual memory used by the entire map. – Drew Dormann Dec 20 '11 at 18:00
- @Drew, no, but the first one answers precisely what the OP was asking. – Daniel Daranas Dec 20 '11 at 18:01
std::map and HashMap are dynamic structures. They grow as elements are added, for as long as the system can provide memory for them.
The max_size() member function gives the upper limit the class implementation is able to sustain, but that limit is normally far wider than the capacity of the system the code runs on.
The memory actually available also depends on what else the system is doing besides running your application.
You can arrive at a reasonable number empirically by querying the OS for the amount of free memory it can give your process and dividing that by the size of an element: key plus value plus some per-node overhead (usually 20-24 bytes).

For Java:
HashMap's underlying store is an array whose size is always a power of 2; the largest it can be is 2^30. With the default load factor of 0.75, it will try to grow, and fail, at around 800 million entries (0.75 × 2^30).
TreeMap is not limited that way and can hold more than 2^31 entries (although size() will then just return Integer.MAX_VALUE). The same goes for ConcurrentSkipListMap and ConcurrentHashMap.

Some information to keep in mind (the big picture):
If your data is huge you can't hold it all in memory. You have to go to secondary storage: the disk. Once you go to disk you lose the speed advantage of a hash map, because every disk access incurs a delay (seek time and such); a hash map whose buckets live on disk pays that delay on every probe.
What I'm trying to say is that a map is useless if your data can't fit in memory.
A better solution is to index your data: keep the index in memory, with a pointer to where on disk the data you're looking for lives, and retrieve the data from disk on demand.
Improve this model further by using RAID for storage. Note that going to a DB incurs the same kind of delay as going to the disk.
I suggest you store all the values in a DB, and keep an in-memory dictionary with hashes as keys.

In Java, the size of a HashMap is bounded by the JVM's memory. It can grow as needed; there is no hard limit, as far as I know.
I don't know about C++.

- There is a hard limit: the maximum value of `int`, since that's the return type of `size()`. – Fred Foo Dec 20 '11 at 18:02
There isn't an explicit maximum size - it depends on your platform and your STL implementation. For example, if memory is highly fragmented and the implementation uses a contiguous buffer (which I doubt, since usually only vectors do that), you may run out of space long before your computer's memory is exhausted.
Alternatively, if the implementation allocates small blocks as the container expands, your limit is a combination of the memory your computer has and the limits configured in your OS (ulimit on Linux, or whatever the Windows equivalent is).
The class does have a max_size() member function, but that is a theoretical implementation limit and shouldn't affect you in practice. So, the simple answer: there is no limit except those imposed by your own computer and OS.

You are effectively going to be limited by the memory capacity of your system.
If you are working with huge data, consider where this huge data is coming from. And design your map in a way that leaves the huge data where it already is.

Java or C++ itself is not the limit; in practice you are limited only by resources.
Depending on your requirements, some approaches are:
- more compact structures, like a Patricia trie
- database solutions or file-backed maps
- distributed DHT-based solutions
Try to look here for some tips.