0

I need a fast query for this table:

id|url|domain

example:

   ID   URL                             Domain
    1   http://www.google.de/example1   http://www.google.de
    2   http://www.google.de/example2   http://www.google.de
    3   http://www.google.de/example3   http://www.google.de
    4   http://www.yahoo.de/example1    http://www.yahoo.de
    5   http://www.yahoo.de/example1    http://www.yahoo.de
    6   http://www.yahoo.de/example1    http://www.yahoo.de

The table contains 1 millions rows...so a query should return a respond very fast..

for example i like to get 1000 unique random urls without having duplicate domains in it....

i tried something like:

SELECT x.* 
    FROM ( SELECT * 
               FROM table 
               ORDER BY RAND() ) x 
    GROUP BY domain LIMIT 1000

but it takes 1 minute to gave me some results...thats too time consuming

table structure is myisam, but could be converted to anything else if i got faster results

need some help

thank you

Joe Stefanelli
  • 132,803
  • 19
  • 237
  • 235
  • Do you need "real" randomness? If not, you can generate the random numbers in an array (or another table) separately and get chunks (e.g. via an offset). – Laur Ivan Dec 05 '11 at 21:47
  • yep, each query has to return a nearly different result set, – user1082391 Dec 05 '11 at 22:45
  • This might be of some use: http://stackoverflow.com/questions/249301/simple-random-samples-from-a-mysql-database . I agree with their conclusion: anything other than SQL would probably be more efficient. – Laur Ivan Dec 05 '11 at 22:53
  • "; var_dump($row2); echo ""; } echo "done"; ?> i just get 10 random urls but takes about 50 seconds..... – user1082391 Dec 05 '11 at 23:09
  • the code format....i dont understand....lol – user1082391 Dec 05 '11 at 23:15
  • @user1082391 Don't post obfuscated code into a comment - if it's an answer, post it as an answer with all the pretty formatting that allows. – Cylindric Dec 08 '11 at 15:19

1 Answers1

0
SELECT  RAND() AS 'my_rand', t.*
FROM table t 
GROUP_BY domain
ORDER BY my_rand LIMIT 1000

I'm not sure how truly random that is, but it returns seemingly random results in my test database.

I'm not sure you can do this in a simple query without scanning all of the rows.

Ben English
  • 3,900
  • 2
  • 22
  • 32