0

I have a SELECT query that I am expecting millions of results from. I need to randomize these results in MySQL. Doing it in my script after the query obviously uses too much RAM. Can someone please rework this query so that the results are all random without using order by rand()? I have seen some examples and tried to use them but they don't work for me since they all seem to depend on returning the whole table rather than using a WHERE clause. Here is my query:

SELECT * FROM pool
WHERE gender = 'f'
AND (`location` = 'united states' OR `location` = 'us' OR `location` = 'usa');
xendi
  • 2,332
  • 5
  • 40
  • 64
  • 3
    What exactly have you tried? You've said "tried some examples", well what are they? What's the issue using ORDER BY RAND()? – Arran Jun 13 '12 at 22:06
  • the examples just really dont apply to my query so I didnt post them. ORDER BY RAND() doesn't work for large tables. the partition winds up filling up and the query takes years to complete on a 10 million row table. – xendi Jun 13 '12 at 22:12
  • `ORDER BY RAND()` with huge tables is extremely slow. it generates additional temporary table, more explanation here: – Zefiryn Jun 13 '12 at 22:13
  • i know not to use it for this query. i just don't know what I should use since I have this WHERE clause – xendi Jun 13 '12 at 22:20
  • How many rows are you trying to return? All of them? – Ami Jun 14 '12 at 00:48

2 Answers2

0

If you can use other language, for example php you can use its rand() function to generate ids and add the to the query like

$ids = range($mini, $maxid);
shuffle($ids);
array_slice($ids, 0, $quantity);

Or something similar in any language you are using.

If you need to do this in pure mysql query then here are some alternaties: http://www.electrictoolbox.com/msyql-alternative-order-by-rand/

Zefiryn
  • 2,240
  • 2
  • 20
  • 30
  • bringing 10million ids into php and shuffling them wont work. its too much data to hold in RAM. I saw that article but it assumes you want the whole table. If I add a WHERE clause it wont work. – xendi Jun 13 '12 at 22:18
  • Then write a function that will generate `$quantity` numbers of ids from the given range. I assume you need for example 50 rows from that big table at once. – Zefiryn Jun 13 '12 at 22:23
0

If you have 10 million lines with id's, are they in a contiguous range?

  1. Get the lowest id from the range you want via a quick select.
  2. Get the largest id as well.
  3. generate random numbers in this range using php
  4. once you have your numbers "SELECT * FROM table1 WHERE id IN (the numbers you generated)" or something like that
colonelclick
  • 2,165
  • 2
  • 24
  • 33