0

I have a MongoDB collection, sorted by a field "a". Now, suppose I want to find the amount of objects in the collection for which the field "a" is between x and y. If I understand correctly, the way the naive query will be executed is simply iterating the array and checking all elements, which is O(n) complexity.

But since it is a sorted array, the operation is possible in O(log(n)). What is the fastest way to perform that query?

matanso
  • 1,284
  • 1
  • 10
  • 17
  • "*But since it is a sorted array, the operation is possible in O(1).*" Would you mind sharing the source of this knowledge or an example? I see it possible only in *O(N)* or *O(ln N)* with some nifty optimizations. I would start with profiling normal LINQ's `Count` with predicate, and see if it is good enough. – luk32 Feb 01 '16 at 11:57
  • You're absolutely right. It's possible in O(log(n)), With 2 binary searches. I'll correct my question in a second. – matanso Feb 01 '16 at 11:59
  • Not really. You need to count elements between those two. I simplified it. It's *O(2*ln N + distance(x,y))*, and by `distance(x,y)` I mean distance between the appropriate objects it doesn't necessarily scale with `x` or `y`. And by optimizations I mean some smart caching of index, so that an object knows it's order, so a real count is not needed, but I don't see it too feasible. Anyway's I'd check if MongoDB's LINQ engine is smart enough to optimize the predicate in `Count` to binary searches. – luk32 Feb 01 '16 at 12:10
  • No, you don't have to iterate through all of the elements between them, only find the first elements greater than X and the last one smaller than Y. – matanso Feb 01 '16 at 12:18
  • You need to, you said you want to get amount. I.e. to count them. It's a funny thing that finding bounds of the subset is easy, but saying how many element is there is not. This is also a reason why C# implementation of [SortedSet.GetViewBetween](https://msdn.microsoft.com/en-us/library/dd381801(v=vs.110).aspx) has poor performance, there is a rule, that `Count` property must be *O(1)*, so upon creation of the view it count's it's elements and for large spanning views the performance drops to *O(N)*. More: http://stackoverflow.com/questions/9850975/why-sortedsett-getviewbetween-isnt-olog-n – luk32 Feb 01 '16 at 12:18

0 Answers0