1

I'm not particularly knowledgeable about MYSQL queries and optimising them, so I require a bit of help on this one. I'm checking a table of international cities to find the 10 nearest cities based on the longitude and latitude values in the table.

The query I'm using for this is as follows:

SELECT City as city,
       SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
           POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2)) AS distance
from `cities`
group by `City`
having distance < 50
order by `distance` asc
limit 10

(The longitude & latitude values are obviously placed dynamically in my code)

sometimes this can take around 3-4 mintues of my development environment to complete.

Have I made any classic mistakes here, or is there a much better query I should be using to retrieve this data?

Any help woould be greatly appreciated.

jarlh
  • 42,561
  • 8
  • 45
  • 63
ST-NEIL
  • 29
  • 2
  • 6
  • Could you provide an execution plan? – Jester Nov 15 '16 at 09:53
  • Calculations in the select are generally slow so I think that is your problem – Jester Nov 15 '16 at 09:54
  • see http://stackoverflow.com/a/38771805/267540 and http://stackoverflow.com/a/38548557/267540 – e4c5 Nov 15 '16 at 09:55
  • have you tried performing the calculations for the distance outside of the query? – barudo Nov 15 '16 at 09:55
  • 2
    You can add filter to restrict rectangle first WHERE abs(Latitude-51.5073509) – StanislavL Nov 15 '16 at 09:55
  • 2
    The query is false. Putting distance in the HAVING clause makes no sense. What exactly are you trying to achieve? – David דודו Markovitz Nov 15 '16 at 09:55
  • avoid `having` clause... – Vikrant Nov 15 '16 at 09:58
  • The **alias** *distance* isn't recognizable in that where clause; hence use of the having clause. Agree this isn't optimal, but there is a reason, thus I wouldn't go so far as to say the existing query "is false" – Paul Maxwell Nov 15 '16 at 10:16
  • How many cities you are checking? The whole world? If yes, why don't you restrict this by an initial filter (say, the city's longitude must be within a certain difference from your center point, and the same for the latitude). This way you will only perform your calculation on a **drastically** reduced set of candidates. – FDavidov Nov 15 '16 at 10:17
  • @Used_By_Already, **(1)** How many rows are in each `City` group? multiple rows? a single row? **(2)** If there are multiple rows in the group, what exactly is being tested with `having distance < 50`? **(3)** How does MySQL group by? In what complexity? – David דודו Markovitz Nov 15 '16 at 10:45
  • If `sqrt(something) < 50`, then `something < 50^2`; that saves you the square root. But as others stated, calculations like these are slow. – mrjink Nov 15 '16 at 12:16

1 Answers1

0

Assuming City is unique and you are abusing GROUP BY and HAVING in order to get a cleaner code

SELECT City as city,
       SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
           POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2)) AS distance

from `cities`

where  SQRT(POW(69.1 * (Latitude - 51.5073509), 2) +
           POW(69.1 * (-0.1277583 - Longitude) * COS(Latitude / 57.3), 2))  < 50

order by `distance` asc

limit 10

  • If City is unique then the aggregation is done on single rows.
    MySQL uses sort operation to implement GROUP BY.
    Sort complexity is O(n*log(n)), so without indexes this is going to complexity of GROUP BY.
  • If City is not unique than the filtering in the HAVING CLAUSE is done on one arbitrary row which is for sure not what the OP intended.

The case where HAVING and WHERE are both relevant for filtering and HAVING has an performance advantage is where the filtering is done on the aggregated column, there are some heavy calculations and the GROUP BY operation significantly reduce the number of rows

select x,... from ... group by x having ... some heavy calculations on x ...
David דודו Markovitz
  • 42,900
  • 6
  • 64
  • 88
  • It may appear that the complex predicate is inefficient in the where clause (and that is why group by/having were used) perhaps you could elaborate on that point? – Paul Maxwell Nov 15 '16 at 11:32