let me just say this is a HW project from UC Berkeley, so If you could give me a hint instead of the solution, It'd be great.
The problem is to create a data structure called "YearlyRecord" that maps a word to the number of times it is encountered in an ngram file. Here are the restrictions.
- getting the count must be O(1)
- getting the number of mappings must be O(1)
- "put" must be on average O(log n)
- getting the "rank" must be O(1)
- counts, which provides a data structure of all of counts in increasing order
- words, which provides a data structure of all words, in the order of counts
Note: "rank" is a function that takes a word and tells me its rank (e.g. 1
, 2
, 3
, 4
), which is based on the number of occurrences we have of it.
My solutions so far:
- Hashmap (word -> count) provides O(1) getCount operation and O(1) size operation
- make a private inner class that implements Comparator which compares based on the number of occurrences in the hashmap. Then use this comparator as input to a TreeSet of the words. This provides the "words" collection, which we can return in O(1) time.
- Make another TreeSet of the counts. This provides an O(1) return of the counts in increasing order.
- rank: Here is where I am stuck. It is clear that rank should be a map as well. I have the words in the TreeSet in increasing order of rank, but no indices to map to.