14

Is it true that ORDER BY rand() performance is very slow compared to other solutions? If yes, what are better ways to select random row(s) from the database?

My query:

SELECT sName FROM bpoint WHERE placeID=? ORDER BY rand() LIMIT 1; 
Biker John
  • 2,621
  • 10
  • 33
  • 52
  • 3
    It depends on how much data there is. How big table are we talking about? – bestprogrammerintheworld Apr 29 '13 at 22:33
  • It also depends on what, exactly, you mean by "random": do you require every record be selected with equal probability? Or is a perfectly uniform distribution not necessary? – eggyal Apr 29 '13 at 22:34
  • 5-10.000 rows. Equal probability desired. – Biker John Apr 29 '13 at 22:37
  • 1
    If you know that an indexed column contains values ranging from `i` to `j`, you could filter `WHERE column >= FLOOR(i + RAND() * (j – i)) ORDER BY column LIMIT 1`. But if there are gaps, you won't obtain a perfectly uniform distribution of probabilities... – eggyal Apr 29 '13 at 22:42

1 Answers1

14

Yes, ORDER BY RAND() can be very slow in larger result-sets.

An option is to fetch resultset with this statement (into an array):

SELECT sName FROM bpoint WHERE placeID=?; 

After that - use array_rand($resultset) to get a randomized item from the $resultset query.

Muhammad Hassaan
  • 7,296
  • 6
  • 30
  • 50
bestprogrammerintheworld
  • 5,417
  • 7
  • 43
  • 72
  • 2
    So, it is php function array_rand() faster? – Biker John Apr 29 '13 at 23:09
  • @JamaicaBob - it really depends on how large the resultset is. How many rows do you have in your table bpoint? – bestprogrammerintheworld Apr 29 '13 at 23:19
  • approximately 10.000 rows – Biker John Apr 29 '13 at 23:24
  • @JamaicaBob - ok, I THINK it would be faster with an array then, but I'm not sure. I guess you will just have to test and see difference. How long does the actual query take now? – bestprogrammerintheworld Apr 29 '13 at 23:30
  • Okay will do some tests and report :) – Biker John Apr 29 '13 at 23:53
  • 1
    @BikerJohn - Older question, but depending on your programming language, php has mt_rand(). I implemented this by doing a quick count of the number of rows I have, and then doing $random = mt_rand(1,$count) to get a random int in the range, then SELECT * FROM 'table' WHERE ID = $random; Performance numbers jumped significantly. Only works if you have unique IDs though (auto_increment ftw!) – Wes Jun 27 '15 at 17:41
  • @Wes What happens if you delete a row from table and mt_rand() gives you a non-existing ID? Your query would return an empty resultset. A much better solution that is guaranteed to return a result: $random = mt_rand(0, $count-1); SELECT * FROM table LIMIT $random, 1; And you don't need an auto incremented field for this to work. – pauk960 Jul 20 '16 at 04:27
  • @pauk960 so your solution would not return an empty result set, but I would be very interested to see the performance numbers of using LIMIT vs a direct select on an indexed field. I wrote my original response over a year ago and we have recently moved to an even faster solution that relies on precomputing the indexed IDs. – Wes Jul 21 '16 at 21:54
  • So the mysql is not good because the table is too big but then the php is ok because of what? And this is considered a solution - I'm disappointed. Start learning MySQL and don't cut corners. – obsergiu Oct 14 '17 at 21:45