I'm very new to Java so forgive me if I'm doing something terribly wrong.
I'm working on a project where I need to quickly scan a very large volume of data (a CSV with 50 million lines or more, 5 entries per line) for repeats. I've resorted to using a HashMap, since its containsKey() method is fast.
However, I end up having to store a million keys or more in the map. Each key is associated with an int[] that holds anywhere from 1 to 100 entries. So, unsurprisingly, I get an OutOfMemoryError unless I'm using a laptop with ~16 GB of RAM.
I was thinking that once the HashMap gets more than N keys, or a key gets more than N entries, I could write it somewhere and clear it. However, not all keys or values are found at once, so I need to be able to add to the written map rather than overwrite it.
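To make the idea concrete, here is a rough sketch of the "write it out and clear" approach, under some assumptions: keys are Strings with no tabs or newlines in them, values are collected in a List<Integer> instead of an int[], and the class name SpillingIndex, the spill file path, and the maxKeys threshold are all made up for illustration. The file is opened in append mode, so each flush adds to what was written before instead of overwriting it.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: buffers key -> values in memory and spills to disk when too large.
class SpillingIndex {
    private final Map<String, List<Integer>> buffer = new HashMap<>();
    private final Path spillFile; // assumed location of the on-disk data
    private final int maxKeys;    // assumed threshold ("N keys" in the text above)

    SpillingIndex(Path spillFile, int maxKeys) {
        this.spillFile = spillFile;
        this.maxKeys = maxKeys;
    }

    // Record one value for a key; spill everything once the buffer holds too many keys.
    void add(String key, int value) throws IOException {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        if (buffer.size() > maxKeys) {
            flush();
        }
    }

    // Append every buffered (key, value) pair as one "key<TAB>value" line, then clear.
    void flush() throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(
                spillFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (Map.Entry<String, List<Integer>> entry : buffer.entrySet()) {
                for (int value : entry.getValue()) {
                    out.write(entry.getKey());
                    out.write('\t');
                    out.write(Integer.toString(value));
                    out.newLine();
                }
            }
        }
        buffer.clear();
    }

    // Number of distinct keys currently held in memory.
    int bufferedKeys() {
        return buffer.size();
    }
}
```

With this layout the same key can show up on several lines of the file, one per flush, so the values for a key would still have to be grouped in a later pass (for example by sorting the file by key). The sketch only covers the appending part; checking for repeats against data that has already been spilled is not handled here.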
I've searched far and wide and still can't find a way to do it, so thanks a lot to whoever can help!