Given a sequence of data (it may have duplicates), a fixed-sized moving window, move the window at each iteration from the start of the data sequence, such that (1) the oldest data element is removed from the window and a new data element is pushed into the window (2) find the median of the data inside the window at each moving.
The following posts are not helpful.
Effectively to find the median value of a random sequence
joining data based on a moving time window in R
My idea:
Use 2 heaps to hold median. In side the window, sort the data in the window in the first iteration, the min heap holds the larger part and the max heap holds the smaller part. If the window has odd number of data, the max heap returns the median otherwise the arithmetic mean of the top elements of the two heaps is the median.
When a new data is pushed in to the window, remove the oldest data from one of the heap and compare the new data with the top of max and min heap so that to decide which heap the data to be put. Then, find the median just like in the first iteration.
But, how to find a data element in a heap is a problem. Heap is a binary tree not a binary search tree.
Is it possible to solve it with O(n) or O(n * lg m) where m is the window size and space: O(1) ?
Any help is really appreciated.
Thanks