I have a program that works on enormous data sets. The objects are best stored in hash-based containers, since the program constantly looks up objects in the container.
My first idea was to use a HashMap, since this container's get and remove methods best suit my needs.
However, I found that HashMap consumes a lot of memory, which is a major problem. I thought switching to HashSet would be better because it only uses <E>, and not <K,V>, per element. But when I looked at the implementation, I learned that it uses an underlying HashMap! This means it won't save any memory!
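The observation about HashSet can be checked without reading the source. The following is a minimal sketch, assuming an OpenJDK-style class library in which HashSet keeps its backing container in an internal field named map. That field name is an implementation detail rather than documented API, and the helper class BackingFieldCheck is made up for illustration.

```java
import java.lang.reflect.Field;
import java.util.HashSet;

// Hypothetical helper, not part of any library.
class BackingFieldCheck {
    // Returns the declared type of HashSet's internal "map" field.
    // Assumption: the field is named "map", as in the OpenJDK sources.
    static Class<?> backingType() {
        try {
            Field f = HashSet.class.getDeclaredField("map");
            return f.getType(); // only the declared type is read, not the value
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("no field named 'map' in this JDK", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(backingType().getName());
    }
}
```

On an OpenJDK-based runtime this should report java.util.HashMap, which matches what the question describes: each set element is stored as a key in a map entry, so the per-element cost is about the same.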
So these are my questions:
- Are all my assumptions true?
- Is HashMap wasteful with memory? More specifically, what is its overhead per entry?
- Is HashSet just as wasteful as HashMap?
- Are there any other hash-based containers that consume significantly less memory?
Update
As requested in the comments, I will elaborate a bit on my program. The HashMap is meant to hold pairs of other objects, along with a numeric value (a float) calculated from each pair. Along the way, the program extracts some of the pairs and inserts new ones. Given a pair, it needs to ensure that it doesn't already hold this pair, or to remove it. The mapping can be keyed on the float value or on the hashCode of the pair object.
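To make the access pattern concrete, here is a minimal sketch of the structure described above. The names PairKey and PairStore, and the use of String for the two paired objects, are placeholders and not the real program's types. Note that a Map<PairKey, Float> stores each float boxed as a Float object, which adds to the per-entry cost.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Placeholder key type: an ordered pair of two objects.
final class PairKey {
    final String first;
    final String second;

    PairKey(String first, String second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PairKey)) return false;
        PairKey other = (PairKey) o;
        return Objects.equals(first, other.first)
                && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}

// Placeholder container mapping each pair to its computed float value.
final class PairStore {
    private final Map<PairKey, Float> values = new HashMap<>();

    // Adds the pair only if it is not already held; returns true if added.
    boolean addIfAbsent(PairKey key, float value) {
        return values.putIfAbsent(key, value) == null;
    }

    // Removes the pair; returns its value, or null if it was not held.
    Float remove(PairKey key) {
        return values.remove(key);
    }

    int size() {
        return values.size();
    }
}
```

Keying on the pair itself (via equals and hashCode) rather than on the float value avoids collisions between different pairs that happen to produce the same number.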
Additionally, when I say "enormous data sets", I am talking about ~4*10^9 objects.
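For a sense of scale, the per-entry cost can be estimated with simple arithmetic. This is a rough back-of-the-envelope sketch, assuming a 64-bit HotSpot-style JVM where each HashMap entry node holds an int hash plus three references (key, value, next), objects are aligned to 8 bytes, and object headers take 12 bytes with compressed references or 16 bytes without. These sizes are assumptions about a typical JVM, not guarantees, and they exclude the bucket array and the key and value objects themselves. The class name EntryCostEstimate is made up.

```java
// Hypothetical calculator; all sizes are assumptions about a typical 64-bit JVM.
class EntryCostEstimate {
    // Rounds a size up to the next multiple of 8 (assumed object alignment).
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Estimated size of one entry node: header + int hash + 3 references.
    static long nodeBytes(long headerBytes, long refBytes) {
        return align8(headerBytes + 4 + 3 * refBytes);
    }

    public static void main(String[] args) {
        long entries = 4_000_000_000L;        // ~4*10^9, from the question
        long compressed = nodeBytes(12, 4);   // 28 bytes, aligned up to 32
        long uncompressed = nodeBytes(16, 8); // 44 bytes, aligned up to 48
        System.out.println(compressed * entries);
        System.out.println(uncompressed * entries);
    }
}
```

Under these assumptions the entry nodes alone would need roughly 128 GB to 192 GB for 4*10^9 entries, before counting the keys, the boxed values, or the bucket array.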