Summary:
In this scenario the load factor might look like the culprit, but it can't be the underlying cause of your OOMEs: the load factor only controls the wasted backing-array space, and at the default of 0.75 that consumes only ~2.5% of your heap (nor does it create the high object counts that drive GC pressure). More likely, the space used by your stored objects and their associated `HashMap.Entry` objects has consumed the heap.
Details:
The load factor for a `HashMap` controls the size of the underlying array of references used by the map. A higher load factor means the array is allowed to fill further before the map resizes, so there are fewer empty array elements for a given number of mappings. In general, then, increasing the load factor results in less memory use, since there are fewer empty array slots.[3]
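As a quick sketch of what those two knobs mean (the capacities here are just illustrative; the two-argument constructor is the standard `java.util.HashMap` API):

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorDemo {
    public static void main(String[] args) {
        // Initial capacity 16, default load factor 0.75: the map resizes
        // (doubling the backing array) once it holds more than 16 * 0.75 = 12 entries.
        Map<Integer, String> defaults = new HashMap<>(16, 0.75f);

        // Load factor 0.25: the map resizes after only 16 * 0.25 = 4 entries,
        // so the backing array stays mostly empty -- more memory, fewer collisions.
        Map<Integer, String> sparse = new HashMap<>(16, 0.25f);

        for (int i = 0; i < 100; i++) {
            defaults.put(i, "v" + i);
            sparse.put(i, "v" + i);
        }
    }
}
```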
That established, it is unlikely you can solve your OOMEs by adjusting the load factor: an empty array element only "wastes" 4 bytes[1], so for an array of 5M-10M elements, the default load factor of 0.75 will waste something like 25 MB of memory[2]. That's only a small fraction of the 1,024 MB of heap you are allocating, so you aren't going to solve your OOMEs by adjusting the load factor (unless you were using something very silly, like an extremely low value of 0.05). The default load factor will be fine.
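To make the ~25 MB figure concrete, here is the worst-case arithmetic from [2] as a runnable snippet (the 4-byte reference size assumes compressed oops, per [1]):

```java
public class WastedSlots {
    public static void main(String[] args) {
        long capacity = 10_000_000L;      // backing-array slots, order of magnitude
        double worstCaseLoad = 0.75 / 2;  // load immediately after a resize: 0.375
        long emptySlots = (long) ((1 - worstCaseLoad) * capacity);
        long refBytes = 4;                // HotSpot reference size with compressed oops
        System.out.println("Wasted: ~" + emptySlots * refBytes / 1_000_000 + " MB");
        // prints "Wasted: ~25 MB" -- a rounding error next to a 1,024 MB heap
    }
}
```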
Most likely it is the actual size of the objects and `Entry` objects stored in the `HashMap` that is causing the problem. Each mapping has a `HashMap.Entry` object that holds the key/value pair along with a couple of other fields (e.g., the hash code, and a pointer to the next item when chained). This `Entry` object itself consumes about 32 bytes; add the 4 bytes for the underlying array slot and, allowing for alignment, you're at roughly 40 bytes per mapping, so 40 bytes * 10M entries = ~400 MB of heap for the overhead of the entries alone. Then the actual objects you are storing take space too: if your object has even a handful of fields, it will be at least as large as its `Entry` object, and your heap is pretty much exhausted.
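Same back-of-the-envelope style for the entry overhead (the byte counts are the rough HotSpot estimates above, not measured values):

```java
public class EntryOverhead {
    public static void main(String[] args) {
        long entries = 10_000_000L;
        // ~32 bytes for the HashMap.Entry (header + hash + key/value/next references)
        // plus ~4 bytes for its slot in the backing array, rounded to 40 for alignment.
        long perMappingBytes = 40;
        System.out.println("Overhead: ~" + entries * perMappingBytes / 1_000_000 + " MB");
        // prints "Overhead: ~400 MB" -- before counting the keys and values themselves
    }
}
```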
The fact that you are getting a `GC overhead limit exceeded` error rather than a plain `Java heap space` one generally means you are approaching the heap limit slowly while churning a lot of objects: in that scenario the GC tends to fail this way, spending nearly all its time collecting, before the heap actually runs out of space.
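If you want to confirm the two failures share the same root cause, HotSpot's `-XX:-UseGCOverheadLimit` flag disables that check, after which you'll typically see the plain `Java heap space` failure instead; useful as a diagnostic, not a fix (the jar name below is a placeholder):

```sh
# Diagnostic only: disable the GC-overhead check and let the heap fail outright.
java -Xmx1024m -XX:-UseGCOverheadLimit -jar your-app.jar
```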
So most likely you simply need to allocate more heap to your application, find a way of storing fewer elements, or reduce the per-element size (e.g., with a different data structure or object representation).
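For the first of those options, the maximum heap is set with the standard `-Xmx` flag (again, the jar name is a placeholder):

```sh
# Double the heap from 1,024 MB to 2,048 MB.
java -Xmx2048m -jar your-app.jar
```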
[1] Usually 4 bytes on HotSpot anyway, even when running the 64-bit JDK - although it may be 8 bytes on some 64-bit platforms if compressed oops is disabled for some reason.
[2] Worst case, a 0.75 load factor means a load of `0.75 / 2 = 0.375` immediately after a resize, so you have `(1 - 0.375) * 10,000,000` empty elements, at 4 bytes per element = ~25 MB. During a rehash you could add another factor of ~1.5 in the worst case, since both the old and new backing arrays are on the heap simultaneously. Once the map size stabilizes, though, this doesn't apply.
[3] This is true even with chaining, since in general the use of chaining doesn't increase memory use (i.e., the `Entry` elements already have the "next" pointer embedded regardless of whether the element is in a chain or not). Java 8 complicates things, since the `HashMap` implementation was improved such that large chains may be converted into trees, which may increase the footprint.