44

I'm wondering how a geospatial index, such as the one used by MongoDB, works. Can anyone explain what data structure/algorithm is used internally? What time complexity does a search run in?

Links to resources would be great too.

Scott Hyndman
  • 923
  • 1
  • 7
  • 15
  • 3
    @Will I think it would be good to re-open this question. It does actually ask for factual answers and the value of the responses so far are high. As a programmer working on databased index technology I think it would be of value to hear more answers :) – Rob Evans Sep 09 '15 at 13:39
  • @RobEvans Yeah, nope. First off, how geospatial indexing works is a subject that can't be well answered within the question/answer format of StackOverflow. Second, this question is looking for links, which is *specifically* not allowed (there's a close reason for it). If you're confused about what is and is not on topic here, please visit [meta]. –  Sep 09 '15 at 13:41
  • 3
    @Will Gotta disagree with your assertion that geo-indexes and their inner workings are not a good fit for Q&A but I suppose that is beside the point. Sure the question could do with an edit to remove link solicitation and be more specific but that doesn't mean it's not of value or a good fit generally. I see quite a few of these that are valuable and closed and that is a shame as the community would definitely find value in them... I've already found value in the two answers below as it reminded me about geohases as b-tree keys. – Rob Evans Sep 09 '15 at 14:40
  • @RobEvans Well, y'wrong. The accepted answer is nothing but links to tree structures. That's only one component in the whole, and *still* too expansive to fit here. You have to visit those links to learn about them. FYI, there are these things called *search engines* which index links already. StackOverflow isn't trying to be a link aggregation service. The second answer is just a ripoff of another question's answer (lovely) that points to another question answered partially by links. Links that rot, btw, rendering the answers useless. Again, visit [meta] to learn why SO is run this way. –  Sep 09 '15 at 15:14
  • 1
    I'm curious too and this question is exactly I wanted to ask on SO too. Can you reopen it? – Jaro Aug 24 '17 at 06:45
  • Google isn't giving good results, but the links in the answers are great. It's not like google will tell me R-Tree is a good data structure for geospatial indexing. That being said, a short explanation of why R-Tree would be/is good for geospatcial indexing will make the answer complete. – andho Aug 18 '19 at 12:53

2 Answers2

21

Depending on the data type and usage pattern, either an R-Tree or variant (R*, R+) or a quadtree or perhaps even a kd-tree.

codekaizen
  • 26,990
  • 7
  • 84
  • 140
4

According to this other SO question:

The current implementation encodes geographic hash codes atop standard MongoDB B-trees. Results of $near queries are exact. One limitation with this encoding, while fast, is that prefix lookups don't give exact results, especially around bit flip areas. MongoDB solves this by doing a grid-neighbor search after the initial prefix scan to pick up any straggler points. This generally ensures that performance remains very high while providing correct results.

Community
  • 1
  • 1
Scott
  • 16,711
  • 14
  • 75
  • 120