0

I am looking for more performance for my project (PHP+MySQL), have a query that seem is too slow (select 2 random users from 1 table)

id  | name |  total | img
------------------------ --
1    user1   500      1
2    user2   600      2
3    user3   650      3

__

SELECT id1, id2, name1, name2, img1, img2, total1, total2
FROM (
  SELECT
    C1.id AS id1, C1.img AS img1, C1.name AS name1,
    C2.id AS id2, C2.img AS img2, C2.name AS name2,
    C1.total AS total1, C2.total AS total2
  FROM users C1, users C2
  WHERE C1.id <> C2.id
    AND ABS(C1.total - C2.total) < 200
) as t
ORDER BY RAND()
LIMIT 1

result

id1  | id2|  name1 | name2 | img1 | img2 |  total1 | total2
------------------------ -------------------------------------
1       3    user1   user3    1      3        500      650

is any way to improve it?

JodyT
  • 4,324
  • 2
  • 19
  • 31
jhenya-d
  • 399
  • 7
  • 19

3 Answers3

1

Make sure that you created indexes for all columns that are in the where clause:

CREATE TABLE `users` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`name` VARCHAR( 255 ) NOT NULL ,
`total` INT NOT NULL ,
`img` INT NOT NULL ,
INDEX ( `total`)
);

Also note that the following query (without the sub query) will give you the same results

SELECT
  C1.id AS id1, C1.img AS img1, C1.name AS name1,
  C2.id AS id2, C2.img AS img2, C2.name AS name2,
  C1.total AS total1, C2.total AS total2
FROM users C1, users C2
WHERE C1.id <> C2.id 
  AND ABS(C1.total - C2.total) < 200
ORDER BY RAND()
LIMIT 1

You can check that at sql fiddle

hek2mgl
  • 152,036
  • 28
  • 249
  • 266
  • 1
    If the table is already created (but the indexes are missing), use `alter table users add index idx_total(total)` – Barranka Mar 07 '13 at 22:55
1

You may be able to optimize the inner SELECT, but using ORDER BY RAND() will always turn the performance to... I can't think of a better word than sh*t. Basically you're telling your DBMS to ingore all indexes and reorder the entire dataset, and once that dataset becomes anything more than 'tiny' it will be a very noticeable performance hit.

I just wrote another answer today about this which should work as long as you don't mind the results not necessarily being different every time.

edit

I just noticed that you're only selecting one row. Try this:

$rs = $dbh->query(
"SELECT COUNT(*) AS 'count'
 FROM users C1, users C2
 WHERE C1.id <> C2.id
   AND ABS(C1.total - C2.total) < 200");
$target = rand(0,$rs[0]['count']);
$rs = $dbh->query(
  "SELECT 
   C1.id AS id1, C1.img AS img1, C1.name AS name1,
   C2.id AS id2, C2.img AS img2, C2.name AS name2,
   C1.total AS total1, C2.total AS total2
  FROM users C1, users C2
  WHERE C1.id <> C2.id
   AND ABS(C1.total - C2.total) < 200
  LIMIT ?,1",
array($target));

It will make use of your indexes, and won't require a re-sort of a potentially large dataset.

It's also not really a super idea to self join a table with the condition that the primary key is not equal to itself, you are essentially squaring the size of your dataset. A table with 1000 rows will yield a set with 999,000 rows. I think rolling the 'total' condition into an explicit JOIN will reduce the performance hit, but I'm not certain.

Change:

  FROM users C1, users C2
  WHERE C1.id <> C2.id
   AND ABS(C1.total - C2.total) < 200

To:

  FROM users C1 INNER JOIN users C2
   ON C1.id <> C2.id
     AND ABS(C1.total - C2.total) < 200
Community
  • 1
  • 1
Sammitch
  • 30,782
  • 7
  • 50
  • 77
  • foreach ($rs as $row) { print $row['name1']; } I get empty result, what I missed --- connection is ok: $dbh = new PDO ("mysql:host=$hostname;dbname=$dbname","$username","$pw"); --- – jhenya-d Mar 08 '13 at 10:38
0

As suggested already, you can optimize it by using indexes. However, I would recommend doing the "random" logic programmatically. Although its perfectly fine to use ORDER BY RAND() for smaller tables, its highly inefficient for larger ones (for example, if you had a table with 10,000 records, it would have to generate 10,000 random numbers and (AFAIK) choose the smallest one).

I would suggest using two queries. Select the COUNT, generate a random number from that, and then use that value in your LIMIT clause.

Example:

//get the total number of rows
$result= mysql_query(" SELECT  COUNT(*) AS total FROM `table` ");
$row = mysql_fetch_array($result);
$total=$row['total'];

//create random value from 1 to the total of rows 
$randomvalue =rand(1,$total);

//get the random row
$result= mysql_query(" SELECT  * FROM `table` limit $randomvalue,1");


In your particular case, you could generate two random numbers and select two users (just make sure the random numbers aren't equal).

EDIT: similar example found here.

Community
  • 1
  • 1
crazylpfan
  • 1,038
  • 7
  • 9