I have a dictionary stored as a text file that maps 2M words to 50k words. I load this file into memory as a HashMap<String, String> by reading it line by line, splitting on a separator, and calling myMap.put(line[0], line[1]). The text file is 45 MB, while the HashMap uses 350 MB of the heap. My goal is to reduce memory usage without harming lookup speed.
myMap.values().size() returns 2M instead of 50k, suggesting that the values are stored as duplicates. Is there a way to make identical values point to the same String object?
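import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

Map<String, String> dict = new HashMap<>();
try (BufferedReader br = new BufferedReader(new FileReader(FILE))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Split on the first ':' only, so a value containing ':' stays intact.
        String[] keyValue = line.split(":", 2);
        if (keyValue.length < 2) {
            continue; // skip lines without a separator
        }
        // intern() returns one canonical String per distinct value.
        dict.put(keyValue[0], keyValue[1].intern());
    }
} catch (IOException e) {
    e.printStackTrace();
}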
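One thing worth noting: values().size() always equals the number of entries in the map, so it returns 2M whether or not the value objects are shared. To see how many distinct String instances are actually held, the values can be counted by identity. Below is a minimal sketch of that check, combined with an alternative to intern() that deduplicates values through a local HashMap acting as a pool. The class name ValuePool, the helper names load and distinctInstances, and the ":" separator with three sample lines are assumptions made for illustration, not part of the original code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ValuePool {

    /** Builds the dictionary from "key:value" lines, keeping one String instance per distinct value. */
    public static Map<String, String> load(List<String> lines) {
        Map<String, String> dict = new HashMap<>();
        // Hypothetical local pool: maps each distinct value to its canonical instance.
        Map<String, String> pool = new HashMap<>();
        for (String line : lines) {
            String[] keyValue = line.split(":", 2);
            if (keyValue.length < 2) {
                continue; // skip lines without a separator
            }
            // Returns the instance already pooled, or pools this one on first sight.
            String canonical = pool.computeIfAbsent(keyValue[1], v -> v);
            dict.put(keyValue[0], canonical);
        }
        return dict; // the pool itself becomes garbage once loading is done
    }

    /** Counts value objects by reference identity rather than by equals(). */
    public static int distinctInstances(Map<String, String> map) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(map.values());
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("cat:animal", "dog:animal", "oak:plant");
        Map<String, String> dict = load(lines);
        System.out.println(dict.values().size());    // prints 3 (one per entry)
        System.out.println(distinctInstances(dict)); // prints 2 (shared instances)
    }
}
```

Compared with intern(), a local pool keeps the canonical strings out of the JVM-wide string table, and the pool can be discarded after loading. Either approach only deduplicates the 50k values; the 2M keys and the HashMap entries themselves still account for a large share of the heap.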