0

Currently, I am using this query in my PHP script:

SELECT * FROM `ebooks` WHERE `id`!=$ebook[id] ORDER BY RAND() LIMIT 125;

The database will be about 2500 rows big at max, but I've read that ORDER BY RAND() eventually will slow down the processing time as the data in the database grows.

So I am looking for an alternate method for my query to make things still run smoothly.

Also, I noticed that ORDER BY RAND() is not truly randomizing the rows, because often I see that it follows some kind of pattern that sometimes repeats over and over again.

Is there any method to truly randomize the rows?

PeeHaa
  • 71,436
  • 58
  • 190
  • 262
Kid Diamond
  • 2,232
  • 8
  • 37
  • 79
  • 3
    `ORDER BY RAND()` slows down, not `RAND()` itself. – hakre Mar 13 '12 at 16:58
  • 2
    The best way AFAIK is to do the randomizing PHP-side. For example, if your ID's are consistent, you can just get the count(*) from MySQL and then pick some of the id's at random. – Robus Mar 13 '12 at 16:58
  • 3
    Take a look http://stackoverflow.com/questions/1823306/mysql-alternatives-to-order-by-rand – a1ex07 Mar 13 '12 at 16:59
  • You can avoid the pattern by using seed values in your RAND call. This should work: RAND(UNIX_TIMESTAMP()) - it will provide unique seed values to RAND until about 2038. – Surreal Dreams Mar 13 '12 at 17:01
  • possible duplicate of [quick selection of a random row from a large table in mysql](http://stackoverflow.com/questions/211329/quick-selection-of-a-random-row-from-a-large-table-in-mysql) – fredley Mar 13 '12 at 17:13

2 Answers2

1

The RAND() function is a pseudo-random number generator and if you do not initialize it with different values will give you the same sequence of numbers, so what you should do is:

SELECT * FROM `ebooks` WHERE `id`!=$ebook[id] ORDER BY RAND(UNIX_TIMESTAMP()) LIMIT 125;

which will seed the random number generator from the current time and will give you a different sequence of numbers.

RAND() will slow down the SELECT's ORDER BY clause since it has to generate a random number every time and then sort by it. I would suggest you have the data returned to the calling program and randomize it there using something like array_rand.

Karlson
  • 2,958
  • 1
  • 21
  • 48
  • 2
    but unix_timestamp has a granularity of 1 second, so you'll only get a different sequence if there's >1 second between each query run. – Marc B Mar 13 '12 at 17:04
  • 1
    Which could lead to the point to provide some other value hardencoded in the query or sticking to seconds, this would actually make the query cacheable which could be useful depending on the field of use. – hakre Mar 13 '12 at 17:07
  • 1
    @MarcB I've provided a simplistic approach to the problem. You certainly use `gettimeofday` and calculate number of microseconds since epoch and use that but if you need to return multiple random sets within one second to the same session I would do this in cached dataset rather then by calling a query every single time. – Karlson Mar 13 '12 at 17:11
1

This question has already been answered:

quick selection of a random row from a large table in mysql

Here too:

http://snippetsofcode.wordpress.com/2011/08/01/fast-php-mysql-random-rows/

Community
  • 1
  • 1
roychri
  • 2,866
  • 1
  • 21
  • 29