2

I wanted to be able to randomly select rows from a table in a way that scales up to large tables in MySQL. As I understand it now that really isn't possible. Are there any other databases that I could do this with?

HahaHortness
  • 1,570
  • 1
  • 15
  • 16

3 Answers3

1

Try this:

SELECT ColumnName 
FROM TableName
ORDER BY RAND()
LIMIT 1;
Brian Willis
  • 22,768
  • 9
  • 46
  • 50
1

Does this type of selection need to be made with SQL? If not, you could generate a random number in code and then get a related record by the generated number. Of course that is making a few assumptions:

  • Would require that your Id's are an identity seed
  • your random number is bounded by min & max values.
Jesse
  • 8,223
  • 6
  • 49
  • 81
0

I think this can be done... Sorry -- this is TSQL, but FWIW

Declare 
   @max int, 
   @min int, 
   @rand int

set @max = (select max(id) from myTable)
set @min = (select min(id) from myTable)
set @rand = Cast(((@max + 1) - @min) * Rand() + @min as int)

select * from myTable where id = @rand

Or if you've got gaps in your primary key fields, then:

select max(id) from myTable where id <= @rand

Per @andrew's objection -- he's concerned that this would favor lower id numbers in un-matched contests. If that's a concern, a couple of ways you could handle it:

Take 2 records at a time (a min and a max). For example:

INSERT INTO whatever_table_variable_etc
select max(id) from myTable where id <= @rand
UNION
select min(id) from myTable where id >= @rand

Or you could base it on odds and evens. For example:

Declare @isOdd bool
SET @isOdd= CASE WHEN ABS(@rand) % 2 = 1 THEN 1 ELSE 0 END

If @isOdd = 1
   select max(id) from myTable where id <= @rand
ELSE 
   select min(id) from myTable where id >= @rand
Chains
  • 12,541
  • 8
  • 45
  • 62
  • 2
    this isn't very good if you've deleted from the table, because deletion leaves gaps in the ids. the gaps means that you pick the entries before (or after, depending on exactly how you do the query) the gaps more often than other answers (because effectively they are picked for each time they or a missing id is selected). – andrew cooke Aug 11 '11 at 00:05
  • @andrew -- it's just conceptual. in answer to your specific objection, it would be easy enough to deal with by switching back and forth from max to min, so the distribution would be even. No doubt he'll want more than just one record. – Chains Aug 11 '11 at 13:50
  • @andrew -- sure it would. Anyway, time to put an answer up yourself, if you can. – Chains Aug 11 '11 at 14:12
  • i don't have a good solution. it's a known, hard problem. i'm just telling you that it's not as easy as you think. – andrew cooke Aug 11 '11 at 14:57
  • @andrew -- no doubt, the pure randomness problem hasn't been solved, and I don't think it every will be. But as for a practical application (i.e., something that not only works reasonably well, but also scales well to large tables), I think you have to ignore the extreme edge cases -- there's no way to catch them all -- or at least, nobody's managed to do it yet. – Chains Aug 11 '11 at 15:19