0

I have a query like this :

SELECT *, (
        6371 * acos (
            cos ( radians(33.577718) )
            * cos( radians( `Latitude` ) )
            * cos( radians( `Longitude` ) - radians(115.846524) )
            + sin ( radians(33.577718) )
            * sin( radians( `Latitude` ) )
        )
    ) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;

The query execution is always somewhere between 3.5 and 4 seconds.

I have applied a composite index to Latitude and Longitude by running ALTER TABLE geopc_cn_places_grouped ADD INDEX index_Longitude_Latitude(Longitude, Latitude);, but it doesn't reduce the execution time.

I want to know why it's running slow and what possible optimizations can be done.

The slow query log message shows this

Slow Query Log

and this is the EXPLAIN SELECT query

EXPLAIN SELECT

Table Structure...

Table Structure

and lastly, here is the table index list

Table Indexes

O. Jones
  • 103,626
  • 17
  • 118
  • 172
Budianto IP
  • 1,118
  • 1
  • 13
  • 27

2 Answers2

1

Your query as written isn't sargable. That is, it cannot exploit any index. So, each time you run it, you use that big spherical cosine law formula for every row in your table. It's a full table scan. It's likely that most of your slowness comes from the table scan, because modern computers do the math pretty quickly once they have the data in RAM.

But, you're in luck. Your search looks for points within a 200 statute mile radius of your candidate point. That means you can usea WHERE ... BETWEEN clause to eliminate points that are more than 200 miles south or north (latitude) of your starting point.

To do this you need to know there are 69.0 statute miles, 60 nautical miles, and 111.045 km in each degree of latitude. Therefore you should search for point ± (200/69) So.... try a query like this.

SELECT *, (
        6371 * acos (
            cos ( radians(33.577718) )
            * cos( radians( `Latitude` ) )
            * cos( radians( `Longitude` ) - radians(115.846524) )
            + sin ( radians(33.577718) )
            * sin( radians( `Latitude` ) )
        )
    ) AS `distance`
FROM `geopc_cn_places_grouped`
WHERE `Latitude`!=33.577718 AND `Longitude`!=115.846524
  AND Latitude BETWEEN 33.577718 - (200/69) AND 33.577718 + (200/69)
HAVING `distance` < 200
ORDER BY `distance` ASC
LIMIT 30;

Then create an index on your Latitude column.

CREATE INDEX latsearch ON geopc_cn_places_grouped(Latitude);

The Latitude BETWEEN clause I suggest will then do an index range scan and so skip many of the rows in your table. That's the classic SQL way of making queries faster.

This is a simplification of the ideal answer to this question. I wrote up this problem here.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • You're a genius! The execution time is now cut into half, I'm completely satisfied with the result, superb, thanks! – Budianto IP Jan 02 '21 at 18:02
0

Your query must compute the distance for every row. The quick solution is to use a "bounding box". This limits the number of rows to test to a latitude stripe or longitude stripe.

Details (and more advanced speedups): http://mysql.rjweb.org/doc.php/find_nearest_in_mysql

Rick James
  • 135,179
  • 13
  • 127
  • 222