database for random selection

Question

I wanted to be able to randomly select rows from a table in a way that scales up to large tables in MySQL. As I understand it now that really isn't possible. Are there any other databases that I could do this with?

it is possible with mysql (efficiently) if you are not deleting from the table. so it's worth asking - do you ever expect to delete entries for this table? — andrew cooke, Aug 10 '11 at 23:39
If you want one row, this scales well with huge tables: http://stackoverflow.com/questions/6541644/selecting-random-rows-with-mysql/6542113#6542113 — ypercubeᵀᴹ, Aug 11 '11 at 14:36
I'll need to be able to add and remove stuff and right now I just want one row. — HahaHortness, Aug 12 '11 at 00:05

score 1 · Answer 1 · answered Aug 10 '11 at 23:35

1

Try this:

SELECT ColumnName 
FROM TableName
ORDER BY RAND()
LIMIT 1;

answered Aug 10 '11 at 23:35

Brian Willis

22,768
9
46
50

1

`ORDER BY RAND()` really doesn't scale well when you're talking millions of rows. – Mike Sherrill 'Cat Recall' Aug 10 '11 at 23:36

score 1 · Accepted Answer · answered Aug 10 '11 at 23:41

Does this type of selection need to be made with SQL? If not, you could generate a random number in code and then get a related record by the generated number. Of course that is making a few assumptions:

Would require that your Id's are an identity seed
your random number is bounded by min & max values.

Chains · Answer 3 · 2011-08-11T14:09:34.843

0

I think this can be done... Sorry -- this is TSQL, but FWIW

Declare 
   @max int, 
   @min int, 
   @rand int

set @max = (select max(id) from myTable)
set @min = (select min(id) from myTable)
set @rand = Cast(((@max + 1) - @min) * Rand() + @min as int)

select * from myTable where id = @rand

Or if you've got gaps in your primary key fields, then:

select max(id) from myTable where id <= @rand

Per @andrew's objection -- he's concerned that this would favor lower id numbers in un-matched contests. If that's a concern, a couple of ways you could handle it:

Take 2 records at a time (a min and a max). For example:

INSERT INTO whatever_table_variable_etc
select max(id) from myTable where id <= @rand
UNION
select min(id) from myTable where id >= @rand

Or you could base it on odds and evens. For example:

Declare @isOdd bool
SET @isOdd= CASE WHEN ABS(@rand) % 2 = 1 THEN 1 ELSE 0 END

If @isOdd = 1
   select max(id) from myTable where id <= @rand
ELSE 
   select min(id) from myTable where id >= @rand

edited Aug 11 '11 at 14:09

answered Aug 10 '11 at 23:47

Chains

12,541
8
45
62

2

this isn't very good if you've deleted from the table, because deletion leaves gaps in the ids. the gaps means that you pick the entries before (or after, depending on exactly how you do the query) the gaps more often than other answers (because effectively they are picked for each time they or a missing id is selected). – andrew cooke Aug 11 '11 at 00:05
@andrew -- it's just conceptual. in answer to your specific objection, it would be easy enough to deal with by switching back and forth from max to min, so the distribution would be even. No doubt he'll want more than just one record. – Chains Aug 11 '11 at 13:50
@andrew -- sure it would. Anyway, time to put an answer up yourself, if you can. – Chains Aug 11 '11 at 14:12
i don't have a good solution. it's a known, hard problem. i'm just telling you that it's not as easy as you think. – andrew cooke Aug 11 '11 at 14:57
@andrew -- no doubt, the pure randomness problem hasn't been solved, and I don't think it every will be. But as for a practical application (i.e., something that not only works reasonably well, but also scales well to large tables), I think you have to ignore the extreme edge cases -- there's no way to catch them all -- or at least, nobody's managed to do it yet. – Chains Aug 11 '11 at 15:19

database for random selection

3 Answers3