
I am in the process of migrating an app from Solr to Elasticsearch.

This app provides reverse geolocation: it returns the N nearest points for a given set of coordinates.

In Solr, this is greatly optimized by applying a bounding box filter before sorting the results found inside that box (from 1-2 seconds down to 50 ms); this discussion on GitHub explains it in more depth.

I want to achieve the same in ES, because a simple geo_distance filter also gives me slow results there.

Whereas Solr asks me for a central point and a radius (e.g. 5 km), ES needs the coordinates of the bounding box.

Computing those values is far more complex than just giving a radius and letting some robust tool compute the bounding box itself.
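For reference, the client-side calculation could look roughly like the sketch below. It assumes a spherical Earth with a mean radius of 6371.0088 km, and the names `boundingBox`, `GeoPoint` and `BoundingBox` are hypothetical helpers, not part of Solr, Lucene or Elasticsearch. The result uses `top_left` / `bottom_right` corners, the shape the `geo_bounding_box` filter expects. How a box that crosses the antimeridian (left longitude greater than right longitude) is treated should be checked against the ES version in use.

```typescript
// Mean Earth radius in kilometres (spherical approximation, an assumption).
const EARTH_RADIUS_KM = 6371.0088;

interface GeoPoint { lat: number; lon: number; }
interface BoundingBox { top_left: GeoPoint; bottom_right: GeoPoint; }

const toRad = (deg: number): number => (deg * Math.PI) / 180;
const toDeg = (rad: number): number => (rad * 180) / Math.PI;

// Wrap a longitude in degrees into the [-180, 180] range.
function wrapLon(lon: number): number {
  if (lon < -180) return lon + 360;
  if (lon > 180) return lon - 360;
  return lon;
}

// Smallest lat/lon box that contains the circle of radiusKm around center.
function boundingBox(center: GeoPoint, radiusKm: number): BoundingBox {
  const angular = radiusKm / EARTH_RADIUS_KM; // angular radius in radians
  const latRad = toRad(center.lat);
  const minLat = latRad - angular;
  const maxLat = latRad + angular;

  // If the circle reaches a pole, it spans every longitude.
  if (minLat <= -Math.PI / 2 || maxLat >= Math.PI / 2) {
    return {
      top_left: { lat: Math.min(toDeg(maxLat), 90), lon: -180 },
      bottom_right: { lat: Math.max(toDeg(minLat), -90), lon: 180 },
    };
  }

  // Longitude half-width grows as the centre moves away from the equator.
  const lonDelta = toDeg(Math.asin(Math.sin(angular) / Math.cos(latRad)));

  return {
    top_left: { lat: toDeg(maxLat), lon: wrapLon(center.lon - lonDelta) },
    bottom_right: { lat: toDeg(minLat), lon: wrapLon(center.lon + lonDelta) },
  };
}
```

The box is only a pre-filter: its corners lie farther from the centre than the radius, so the hits still need to be sorted (or filtered) by actual distance afterwards.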

I am wondering whether the bounding box calculation is done by Solr or by Lucene. In the latter case I could expect ES to provide access to this feature, but I can't find anything in the official documentation or by googling.

I would like to avoid the overhead of implementing complex maths in my app if I can leverage the Elasticsearch or Lucene codebase. Moreover, it would probably run faster in Java than in my Node.js app.

I would welcome any advice before filing a feature request on ES and starting to reinvent the wheel.

g4vroche
  • Have you considered [geo_distance_range](http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/query-dsl-geo-distance-range-filter.html) filter? – Andrei Stefan Sep 25 '14 at 11:51
  • Or [geo_distance](http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/query-dsl-geo-distance-filter.html) (even better) filter? – Andrei Stefan Sep 25 '14 at 11:54
  • Yes, I already have this implemented, but it's slow. I therefore assumed it was slow for the same reason Solr was slow with the geodist() function (I edited my question to clarify that this is a performance issue and to link to an explanation of the optimisation). – g4vroche Sep 25 '14 at 13:06
  • With `geo_distance`, ES uses a bounding box calculation to exclude as many documents as it can and it only runs the geo-distance calculations for those matches that fall within the box. As a slight improvement, you could try to set the `distance_type` property for `geo_distance` to "plane" which is a faster calculation formula than the default `sloppy_arc` one. But, of course, it's also less accurate. – Andrei Stefan Sep 25 '14 at 13:16
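As a rough illustration of the suggestion in the last comment, a request body for the ES 1.x `filtered` query might be built as follows. This is a sketch under assumptions: the geo_point field is assumed to be called `location`, and `nearestPointsQuery` is a hypothetical helper name. Only `distance_type: "plane"` on the `geo_distance` filter comes from the comment; the `_geo_distance` sort and `size` are added here to return the N nearest points.

```typescript
interface LatLon { lat: number; lon: number; }

// Build a search body: filter by radius (plane distance), then sort by distance.
function nearestPointsQuery(center: LatLon, radiusKm: number, n: number) {
  return {
    size: n, // keep only the N nearest hits
    query: {
      filtered: {
        query: { match_all: {} },
        filter: {
          geo_distance: {
            distance: `${radiusKm}km`,
            distance_type: "plane", // faster but less accurate than the default
            location: center,       // "location" is an assumed field name
          },
        },
      },
    },
    sort: [
      {
        _geo_distance: {
          location: center, // same assumed field name as in the filter
          order: "asc",
          unit: "km",
        },
      },
    ],
  };
}
```

The returned object would be sent as the body of a `_search` request; whether `plane` is accurate enough depends on the radius and the latitude of the data.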

0 Answers