Give a sequence of data (with duplicates), move a fix-sized window along the data sequence and find mode in the window at each iteraion, where the oldest data is removed and a new data is inserted to the window.
I cannot find better solutions here.
My idea: Use a hashtable, key is the data, key's data is the frequency of the data occuring in the window.
At the first iteration, iterate each data in the window and put it to the hashtable, meanwhile cout the frequency of each data. After that, traverse the hashtable and return the data with the highest frequency. For each following iteration, search the oldest data in the hashtable and reduce its frequency by 1 if it is in the hahstable if it becoms 0 use new data to replace the old one. Otherwise, just insert the new data into the hahstable. Traverse the table and return the mode.
It is O(n * m) where n is data seq size and m is window size. The drawback is : The hashtable size is not fixed, it may have resize overhead. Each iteration, the table has to be traversed, it is not effcient.
Is it possble to do it with O(n lg m) or O(n) ?
Any help is appreciated.
thanks
Another solution: At the first iteration, build up a hashtable with data as key and its frequency as value associated with the key. On the base of the hashtable, build up a multimap with frequency as key and associated data as value.
After that, at each iteration, in the window, remove the oldest data and update the hashtable, and then update the multimap with the newest updated one in hashtable. If the map key has multiple data, replace it the new one with only the data whose frequency not changed. But, add a new pair with the new frequency and data.
In the window, get a new data and also update the hashtable, update the multimap with the newest updated one in hashtable.
The entry located at the most right hand side on the multimap (a binary search tree) is the mode because its value is the highest frequency in the current window.
Time O(m + m * lg m + n * lg m) if n >> m, O(n lg m). Space : O(m)
Any better idea ?