3

I need to retrieve a number of random records from a table. The table contains about 100000 records.

The records must not be consecutive and performance is important.

I tried "order by rand()" but the performance is bad ( ~3 seconds)

Haim Evgi
  • 123,187
  • 45
  • 217
  • 223
Daniel
  • 341
  • 6
  • 24
  • related http://stackoverflow.com/questions/3558665/randomizing-large-dataset – Haim Evgi Feb 22 '11 at 13:26
  • What does the primary key's contents look like? What other unique attributes or attribute combinations does that table have? – TheBlastOne Feb 22 '11 at 13:27
  • I don't think you can go faster unless some other filtering criteria... The engine STILL has to go through 100,000 records AND assign temporary "RAND()" to the set (internally) for the order being applied. – DRapp Feb 22 '11 at 13:28
  • You said, "The records must not be consecutive". Depending on how many rows you have and how many you want, you might get a few that are "consecutive" (whatever that means). – Mike Sherrill 'Cat Recall' Feb 22 '11 at 13:29
  • 1
    @catcall, I think he means that the records in the database are not necessarily consecutive `(1,2,4,5,6,9)` in stead of `(1,2,3,4,5)` – gnur Feb 22 '11 at 13:34
  • 1
    Have you tried getting entropy from another source, such as dumping a PRNG into a file? You could get 50 unsigned integers rather quickly that way, so the only work is in the query. – Tim Post Feb 22 '11 at 13:34
  • @TheBlastOne - the key is autonumber. I also have a unique slug (varchar) – Daniel Feb 27 '11 at 16:22
  • If there are not many gaps in the key, we all know an efficient way to select random key values, don't we ;) since this is very obvious, I won't make that a separate answer :) – TheBlastOne Feb 28 '11 at 10:58
  • I would say thanks beacuse your Rand() Saved my day Thanks indeed – Humphrey Apr 17 '13 at 12:47

3 Answers3

0

You could try to do it in a php loop, but I doubt it will be faster..

$iMaxID = getMaxIdFromYourTable(); //not real php
$records = array();
while (true) {
    $iRandID = rand(1,$iMaxID);
    thisRecord = "SELECT FROM yourtable WHERE id = $iRandID";
    if (numrows > 0) {
        $records[] = thisRecord;
        if (count($records) > 50) {
            break;
        }
    }
}
gnur
  • 4,671
  • 2
  • 20
  • 33
0

I just ran a simple SELECT * FROM table ORDER BY RAND() LIMIT 50; on a table with 229,291 rows. Completed in 0.63 seconds. Given, that RAND() is really slow and there should be a better solution.

I am only aware of various alternate way to select just one random row from a large data set, faster than using ORDER BY RAND(). which is explained here: http://wanderr.com/jay/order-by-slow/2008/01/30/

For multiple random rows, I am currently unaware of a better solution. If the ID's are subsequent and no id's are missing in between, you could generate a comma separated list of random numbers in your own PHP code - and then using a SELECT * FROM table WHERE id IN(5,3,1); query.

Here is a working solution, in another Stack Overflow question: How can i optimize MySQL's ORDER BY RAND() function?

Worked for me, 50 rows in set (0.09 sec) for the table with 229,291 records.

Community
  • 1
  • 1
CharlesLeaf
  • 3,201
  • 19
  • 16
0
SET @o = (SELECT FLOOR(RAND() * COUNT(*)) FROM your_table);
PREPARE STMT FROM 'SELECT * FROM your_table LIMIT ?, 1';
EXECUTE STMT USING @o;
BlueMonkMN
  • 25,079
  • 9
  • 80
  • 146