0

The following query is overloading my system. It seems to be a problem with the rand(). I have seen other posts dealing with similar issues, but can't quite get them working in this problem. It is being run on a 10M+ row table. I know that the order by rand() is the issue, but after reading there seems to be an issue of the autoincrement (items.ID) increments by 2 not 1.

SELECT stores.phone, stores.storeID, stores.name, stores.ZIP,
       stores.state,stores.city, storeID, GEOCODES.lon, GEOCODES.lat
FROM items
LEFT JOIN stores on stores.storeID = items.store_ID
LEFT JOIN GEOCODES on GEOCODES.address = CONCAT(stores.address1,', ',stores.ZIP)
WHERE stores.phone IS NOT NULL
GROUP BY items.store_ID
ORDER BY RAND( ) 
LIMIT 200

The other article that I was trying to follow was How can i optimize MySQL's ORDER BY RAND() function?, but can't seem to figure out how to adapt it to this query. Please note that this is done in PHP.

Community
  • 1
  • 1
user2694306
  • 3,832
  • 10
  • 47
  • 95

2 Answers2

0

Using RAND() in your query has serious performance implications, avoiding it will speed up your query a lot.

Also since you're using php, randomizing the ordering using shuffle() w/ php may be a significantly quicker alternative to mysql.

See: http://php.net/manual/en/function.shuffle.php

AnchovyLegend
  • 12,139
  • 38
  • 147
  • 231
  • How can you be so sure about the statement `using UNION instead of LEFT JOIN will speed up your query a lot`? – Rahul Nov 12 '14 at 21:57
0

if I were you I would LIMIT first and then ORDER BY RAND() on the limited query.. that way you arent pulling everything out and randomizing it.. I have used this exact method to speed up my queries exponentially

SELECT *
FROM
(   SELECT stores.phone, stores.storeID, stores.name, stores.ZIP,
           stores.state,stores.city, storeID, GEOCODES.lon, GEOCODES.lat
    FROM items
    LEFT JOIN stores on stores.storeID = items.store_ID
    LEFT JOIN GEOCODES on GEOCODES.address = CONCAT(stores.address1,', ',stores.ZIP)
    WHERE stores.phone IS NOT NULL
    GROUP BY items.store_ID
    LIMIT 200
) t
ORDER BY RAND( ) 

Some proof:

CREATE table digits as (-- a digit table with 1 million rows)   
    1000000 row(s) affected Records: 1000000  Duplicates: 0  Warnings: 0    
    1.869 sec

SELECT * FROM digits ORDER BY RAND() LIMIT 200  
    200 row(s) returned 
    0.465 sec / 0.000 sec

SELECT * FROM (SELECT * FROM digits LIMIT 200)t ORDER BY RAND()    
    200 row(s) returned 
    0.000 sec / 0.000 sec
John Ruddell
  • 25,283
  • 6
  • 57
  • 86
  • The concept makes sense, but one question. Wouldn't this just select the first 200 items in the table, resulting in a distorted distribution of the results? – user2694306 Nov 12 '14 at 23:20
  • @user2694306 it could select that.. but the selector is free to choose which rows to select without a limit offset... so it could select others. id recommend you try it out! also you can try randomizing it in php with shuffle() and can then pull out the first 200 records in your foreach loop as a side note – John Ruddell Nov 12 '14 at 23:25