I wanted to be able to randomly select rows from a table in a way that scales up to large tables in MySQL. As I understand it now that really isn't possible. Are there any other databases that I could do this with?
Asked
Active
Viewed 103 times
2
-
it is possible with mysql (efficiently) if you are not deleting from the table. so it's worth asking - do you ever expect to delete entries for this table? – andrew cooke Aug 10 '11 at 23:39
-
@Jarl: Select randomly one row or many rows? – ypercubeᵀᴹ Aug 11 '11 at 14:33
-
If you want one row, this scales well with huge tables: http://stackoverflow.com/questions/6541644/selecting-random-rows-with-mysql/6542113#6542113 – ypercubeᵀᴹ Aug 11 '11 at 14:36
-
I'll need to be able to add and remove stuff and right now I just want one row. – HahaHortness Aug 12 '11 at 00:05
3 Answers
1
Try this:
SELECT ColumnName
FROM TableName
ORDER BY RAND()
LIMIT 1;

Brian Willis
- 22,768
- 9
- 46
- 50
-
1`ORDER BY RAND()` really doesn't scale well when you're talking millions of rows. – Mike Sherrill 'Cat Recall' Aug 10 '11 at 23:36
1
Does this type of selection need to be made with SQL? If not, you could generate a random number in code and then get a related record by the generated number. Of course that is making a few assumptions:
- Would require that your Id's are an identity seed
- your random number is bounded by min & max values.

Jesse
- 8,223
- 6
- 49
- 81
0
I think this can be done... Sorry -- this is TSQL, but FWIW
Declare
@max int,
@min int,
@rand int
set @max = (select max(id) from myTable)
set @min = (select min(id) from myTable)
set @rand = Cast(((@max + 1) - @min) * Rand() + @min as int)
select * from myTable where id = @rand
Or if you've got gaps in your primary key fields, then:
select max(id) from myTable where id <= @rand
Per @andrew's objection -- he's concerned that this would favor lower id numbers in un-matched contests. If that's a concern, a couple of ways you could handle it:
Take 2 records at a time (a min and a max). For example:
INSERT INTO whatever_table_variable_etc
select max(id) from myTable where id <= @rand
UNION
select min(id) from myTable where id >= @rand
Or you could base it on odds and evens. For example:
Declare @isOdd bool
SET @isOdd= CASE WHEN ABS(@rand) % 2 = 1 THEN 1 ELSE 0 END
If @isOdd = 1
select max(id) from myTable where id <= @rand
ELSE
select min(id) from myTable where id >= @rand

Chains
- 12,541
- 8
- 45
- 62
-
2this isn't very good if you've deleted from the table, because deletion leaves gaps in the ids. the gaps means that you pick the entries before (or after, depending on exactly how you do the query) the gaps more often than other answers (because effectively they are picked for each time they or a missing id is selected). – andrew cooke Aug 11 '11 at 00:05
-
@andrew -- it's just conceptual. in answer to your specific objection, it would be easy enough to deal with by switching back and forth from max to min, so the distribution would be even. No doubt he'll want more than just one record. – Chains Aug 11 '11 at 13:50
-
@andrew -- sure it would. Anyway, time to put an answer up yourself, if you can. – Chains Aug 11 '11 at 14:12
-
i don't have a good solution. it's a known, hard problem. i'm just telling you that it's not as easy as you think. – andrew cooke Aug 11 '11 at 14:57
-
@andrew -- no doubt, the pure randomness problem hasn't been solved, and I don't think it every will be. But as for a practical application (i.e., something that not only works reasonably well, but also scales well to large tables), I think you have to ignore the extreme edge cases -- there's no way to catch them all -- or at least, nobody's managed to do it yet. – Chains Aug 11 '11 at 15:19