8

I have a table called place:

id | name       | coordinates (longitude, latitude)
1  | London     | -0.12574, 51.50853
2  | Manchester | -2.25, 53.41667
3  | Glasgow    | -4.25, 55.86667

The coordinates column is of the point datatype. I inserted the points into the place table using:

st_geomfromtext('point($longitude $latitude)', 4326)

Notice that I've made use of the SRID.

Given any coordinates, I'd like to find the nearest places to it (order by ascending). The solution I have currently come up with (by reading the MySQL docs) looks like this:

select
    *,
    st_distance_sphere(`place`.`coordinates`, st_geomfromtext('Point($longitude $latitude)', 4326)) as distance
from place
order by distance asc;

After looking at countless similar questions on here and elsewhere, it's quite obvious that this is a lesser known (and newer way) of doing things so there's not much content on this, hence why I'm looking for a bit of clarification.

My questions are:

  1. Is this the best solution / Am I doing this right?
  2. Will this method make use of the spatial index I have on the coordinates column?
  3. When using st_distance_sphere, do I need to specify the radius of the Earth in order to get accurate results? (Edit: no, it uses the radius of the earth by default)

Edit, here are those answers:

explain select ...; returns:

id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra
1  | SIMPLE      | place | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 115687 | 100.00   | Using filesort

flush status; select ...; show session status like 'Handler%'; returns:

Variable_name              | Value
Handler_commit             | 1
Handler_delete             | 0
Handler_discover           | 0
Handler_external_lock      | 2
Handler_mrr_init           | 0
Handler_prepare            | 0
Handler_read_first         | 1
Handler_read_key           | 1001
Handler_read_last          | 0
Handler_read_next          | 0
Handler_read_prev          | 0
Handler_read_rnd           | 1000
Handler_read_rnd_next      | 119395
Handler_rollback           | 0
Handler_savepoint          | 0
Handler_savepoint_rollback | 0
Handler_update             | 0
Handler_write              | 0
Rick James
  • 135,179
  • 13
  • 127
  • 222
Matt Kieran
  • 4,174
  • 1
  • 17
  • 17
  • https://stackoverflow.com/questions/39338167/sql-query-of-haversine-formula-in-sql-server#answer-39338205 – ggdx May 20 '18 at 19:00
  • That question is for sql-server and the answers either reflect that or recommend using the haversine formula. – Matt Kieran May 20 '18 at 19:04
  • Nearest places using lat/long pretty much require the haversine formula. SQL Server pretty much is the same as Mysql in terms of syntax. – ggdx May 20 '18 at 19:20
  • Unfortunately sql-server is very different in this instance. Years ago that was true, haversine was the only way to go but if you do some digging, you'll see a whole new slew of functions have been introduced to MySQL that help with storing and querying spatial points/polygons/etc. – Matt Kieran May 20 '18 at 19:27
  • What version of MySQL do you use? This is relevant because newer versions offer more complete `st_...` functions. – O. Jones May 20 '18 at 22:49
  • @O. Jones Thanks for pointing that out. I'm using the latest version, MySQL 8.0.11. – Matt Kieran May 20 '18 at 23:17
  • @MattKieran you are probably among the first SO people to use the MySQL 8 geospatial stuff. – O. Jones May 21 '18 at 11:22
  • MySQL 5.7.6 added `ST_Distance_Sphere()`. – Rick James May 23 '18 at 21:07
  • @MattKieran - The Show, etc, belong to you; moving them out of my Answer. – Rick James May 25 '18 at 15:39

1 Answers1

1

It may be the best solution. Let's get some other answers first...

What does EXPLAIN SELECT ... say? (This may answer your Q2).

Your query will scan the entire table, regardless of the other answers. Perhaps you want LIMIT ... on the end?

Another thing that might be useful (depending on your app and on the Optimizer): Add a bounding box to the WHERE clause.

In any case, do the following to get an accurate feel for how many rows are actually touched:

FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';

Get back with those answers; then perhaps we can iterate further.

After SHOW STATUS

Well, Handler_read_rnd_next says that it was a full table scan. The 1000 and 1001 -- did you have LIMIT 1000?

I deduce that LIMIT is not factored into how SPATIAL works. That is, it does the simplistic thing -- (1) check all rows, (2) sort, (3) LIMIT.

So, what to do?

Plan A: Decide that you don't want results farther than X miles (km) and add a "bounding box" to the query.

Plan B: Abandon Spatial and dig into a more complex way to do the task: http://mysql.rjweb.org/doc.php/latlng

Rick James
  • 135,179
  • 13
  • 127
  • 222