Interview prep qn: Given a long list of words, return a list of distinct words along with the count of how many occurred using only 16gb of memory.
I am thinking of using a HashMap which stores only one copy of the word and then subsequently increments the value if the same word is encountered again. This is the only approach I can think of. Overall complexity would be O(n) as I need to traverse through the entire list of words once to populate my hashmap with the word and/or increment its count.
What I am unsure about is how to factor in the 16GB of memory fact into my answer?
If each word is 100 bytes (which likely is not as word length can vary), then my hashmap can be of x number of words long, so what? How can I logically arrive at the limit of my solution? I'm a little unsure of how to proceed here..