0

I need a data structure supports the following operations:

  1. Insert a number;
  2. Find the median of all inserted numbers;
  3. (additional) Find the known in advance quantile (0-1) of all inserted numbers;

The simplest way is sort numbers after each insertion, but this is not fast. Is there a faster solution?

3 Answers3

1

You can do operation (1) in O(logN) and (2) in O(1) using two binary heaps (maybe you can do operation (3) fast too, I'm just not very fluent in probability theory).

Create a max-heap L and a min-heap R. Create two variables a and b that will store one (or two) middle elements. Heap L will store all elements lesser than middle one(s), heap R will store all elements greater than middle one(s). Size of L and R will always be the same.

Insert(x):

  • If the count of elements in data structure is odd (a stores middle element, b stores nothing):
    • If x < a:
      • b = a
      • a = x
    • If x >= a:
      • b = x
  • If the count of elements in data structure is even (a and b store two middle elements):
    • If x < a:
      • Put b into R
      • Put x into L
      • b = null
    • If x > b:
      • Put x into R
      • Put a into L
      • a = b
      • b = null
    • If a <= x <= b:
      • Put a into L
      • Put b into R
      • a = x
      • b = null

GetMedian():

  • Just return a (if count of elements is odd) or (a + b) / 2 (of count of elements is even)
ardenit
  • 3,610
  • 8
  • 16
1

A balanced binary search tree can be augmented with the size of the sub-tree at each node. Maintaining this statistic is constant-time during tree rotations, and you can use it to find the ith element in O(log N) for any value of i:

  • i < size(left): recurse to the left.
  • i == size(left): return current node.
  • i > size(left): set i = i - size(left) - 1 and recurse to the right.

Such data structure would give O(log N) bound on all three operations.

Yakov Galka
  • 70,775
  • 16
  • 139
  • 220
0

Well, if you initially know that the contents of the array start out sorted, you could use this property to conduct a binary search to find whether a key exists and if not where it should be inserted. You could then insert the value at that point, and the data will continue to remain "sorted." Kinda old-school, yes, but it would be efficient.

Mike Robinson
  • 8,490
  • 5
  • 28
  • 41