0

I am interested in knowing the fastest way to compute percentile of elements in a collection. The collection is dynamic - elements can be added and removed, and values of elements can change over time. A real-world example is reputation of SO users. What is the fastest way to compute the X in top X percent of each user?

morpheus
  • 18,676
  • 24
  • 96
  • 159
  • Sounds very similar to "[Fast algorithm for repeated calculation of percentile?](http://stackoverflow.com/questions/3738349/fast-algorithm-for-repeated-calculation-of-percentile)" – martinus Jun 12 '11 at 15:26

1 Answers1

3

If you want an in memory data structure, you want an order statistics tree, which is an augmented version of a binary search tree. This supports finding the Nth value in sorted order, insertion, and deletion all in O(log(n)) time.

If you are using an SQL database, keeping an index on the particular column and using a top percent query should be efficient.

Rob Neuhaus
  • 9,190
  • 3
  • 28
  • 37