5

I need to fetch a repeatable random set of rows from a table using MySQL. I implemented this using the MySQL RAND function using the bigint primary key of the row as the seed. Interestingly this produces numbers that don't look random at all. Can anyone tell me whats going on here and how to get it to work properly?

select id from foo where rand(id) < 0.05 order by id desc limit 100

In one example out of 600 rows not a single one was returned. I change the select to include "id, rand(id)" and get rid of the rand clause in the where this is what I got:

| 163345 |  0.315191733944408 |
| 163343 |  0.814825518815616 |
| 163337 |  0.313726862253367 |
| 163334 |  0.563177533972242 |
| 163333 |  0.312994424545201 |
| 163329 |  0.312261986837035 |
| 163327 |  0.811895771708242 |
| 163322 |  0.560980224573035 |
| 163321 |  0.310797115145994 |
| 163319 |  0.810430896291911 |
| 163318 |  0.560247786864869 |
| 163317 |  0.310064677437828 |

Look how many 0.31xxx lines there are. Not at all random.

PS: I know this is slow but in my app the where clause limits the number of rows to a few 1000.

David Tinker
  • 9,383
  • 9
  • 66
  • 98

2 Answers2

11

Use the same seed for all the rows to do that, like:

select id from foo where rand(42) < 0.05 order by id desc limit 100

See the rand() docs for why it works that way. Change the seed if you want another set of values.

Mat
  • 202,337
  • 40
  • 393
  • 406
  • That sucks. Guess I should have read the docs first. My problem is that the query may be run repeatedly and if a row is selected once it must always be selected. So using a constant seed isn't going to work. I might have to filter the rows in the app - less efficient but will work or join to a table of random values. – David Tinker Nov 01 '11 at 12:39
  • I don't understand your problem. You can run the above queries however many times you like, it will return the same thing (assuming you're not changing the table data of course). – Mat Nov 01 '11 at 12:42
  • Unfortunately new rows are added to the table quite often – David Tinker Nov 01 '11 at 12:54
0

Multiply the decimal number returned by id:

select id from foo where rand() * id < 5 order by id desc limit 100
João
  • 120
  • 1
  • 10