2

I have about 100,000 merchants stored in my database, the table contains their Name, Lat, Lon

Right now my search function takes a lat/lon input and finds the stores within 10 miles radius. However, a lot of the 100,000 merchants are franchises (ie. groups of associated stores). For the franchises, I'd like to show only the Closest store, with the rest stores of the same franchise hidden (if there are multiple locations within 10 miles radius).

I've tried doing:

SELECT SQL_CALC_FOUND_ROWS *, xxxxx as Distance
FROM table
WHERE isActive = 1
GROUP BY Name
HAVING Distance < 10
ORDER BY Distance ASC

xxxxx is the function that calculates distance based on input lat/lon:

((ACOS(SIN('$lat'*PI()/180)*SIN(`Latitude`*PI()/180) + COS('$lat'*PI()/180)*COS(`Latitude`*PI()/180)*COS(('$long'-`Longitude`)*PI()/180))*180/PI())*60*1.1515)

However it's not returning the correct results. I'm getting significantly less results regardless of franchised or unfranchised stores when comparing with the same query without the "GROUP BY" clause. I wonder what's the problem?

Also, the speed is really slow. I have Name column indexed. But I think the "GROUP BY Name" is the bottleneck since MySQL must be doing a lot of string comparison? Assuming GROUP BY bug can be fixed, I'm wondering what are my options to make this faster. Is it worth while to setup a "Business_Group" column and pre-process the stores so the franchised stores would be assigned a Business_Group ID, this way GROUP BY would be faster since it's comparing int?

James Gu
  • 1,382
  • 4
  • 26
  • 39
  • 1
    Seems you are looking for MySQL spatial index http://stackoverflow.com/questions/159255/what-is-the-ideal-data-type-for-latitude-longitude http://stackoverflow.com/questions/4726031/mysql-latitude-and-longitude-table-setup – vearutop Jun 27 '12 at 14:32
  • I think the distance calculation is fine (probably not as good as having spatial index setup but it is good for our purpose). Before adding the GROUP BY clause it was really fast and returns the correct results within 10miles. Now I just want to limit the Franchised stores to show only one location, so I tried adding GROUP BY Name, which the result is missing stores that shouldn't be missing and also much slower – James Gu Jun 27 '12 at 14:36
  • Show fast query, before adding `GROUP BY` – vearutop Jun 27 '12 at 14:40
  • SELECT SQL_CALC_FOUND_ROWS *, xxxxx as Distance FROM table WHERE isActive = 1 HAVING Distance < 10 ORDER BY Distance ASC – James Gu Jun 27 '12 at 14:43

2 Answers2

0

Try to group derived table

SELECT * FROM (
    SELECT *, xxxxx as Distance FROM table WHERE isActive = 1 
    HAVING Distance < 10 ORDER BY Distance ASC
) AS all
GROUP BY Name

In not grouped query HAVING applies to every record as instant filter, and with GROUP BY it applies to whole groups, so groups are first filled with all data and then filtered with HAVING.

vearutop
  • 3,924
  • 24
  • 41
0

Make a virtual table using views of calculated column xxxxx

and use join of table and view.

That will be faster and optimized.

Lucifer
  • 264
  • 1
  • 5
  • 12