The performance problem with your query is not the rand()
per se but the order by
on a large number of rows. If you know that foo has a certain number of rows, say 1,000,000, then something like this will run much faster:
SELECT *
FROM foo
WHERE RAND() < 0.00001
ORDER BY RAND()
LIMIT 20;
The where
clause reduces the number of rows for the order by
to about 100. And, you would be very confident that the number would be at least 20. You can automate this calculation as well:
SELECT *
FROM foo cross join
(SELECT count(*) as cnt FROM foo) const
WHERE RAND() < 100.0/cnt
ORDER BY RAND()
LIMIT 20;