
I have a list, user_id, of N integers, e.g.

[1001, 1023, 13452, 1679834, ...]

and a table:

CREATE TABLE content (
  id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  user_id INT,
  content VARCHAR(100),
  score INT
);

I need to take those N integers from user_id and, for each user_id, get the 3 content rows with the highest score. So basically I need to run this query N times:

SELECT *
FROM content
WHERE user_id=1001
ORDER BY score DESC
LIMIT 3;

N could be a very large number, so I'd very much like to avoid running those queries one by one.

Is there any way to reduce the number of queries I need to run? Some sort of bulk select perhaps?


1 Answer


This should work:

$str_ids = implode(', ', $arr_ids);  // e.g. "1001, 1023, 13452, ..." — the ids must be integers

SELECT id, user_id, content, score
FROM (  SELECT c.*,
               -- restart the counter at 1 whenever user_id changes;
               -- rows arrive sorted by user_id, then score DESC
               @rownum := IF(@user_id = c.user_id, @rownum + 1, 1) AS rownum,
               @user_id := c.user_id AS prev_user_id
        FROM (  SELECT *
                FROM content
                WHERE user_id IN ({$str_ids})
                ORDER BY user_id ASC, score DESC) AS c,
             (SELECT @rownum := 0, @user_id := NULL) AS vars
     ) AS ranked
WHERE rownum <= 3;

Maybe there's a better way to do this. If so, let me know!
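As an aside, on MySQL 8.0+ (which postdates this thread) the same per-group limit can be written with a window function, avoiding the session-variable trick entirely. A minimal sketch, reusing the ids from the question:

SELECT id, user_id, content, score
FROM (  SELECT c.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY score DESC) AS rn
        FROM content AS c
        WHERE user_id IN (1001, 1023, 13452, 1679834)
     ) AS ranked
WHERE rn <= 3;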

Robin Castlin
  • There are definitely more efficient variations, but the idea is the same. – newtover Mar 21 '12 at 22:18
  • @newtover - what would be a more efficient variation? – Continuation Mar 21 '12 at 22:26
  • @continuation, the most efficient approach is to walk through the index in an index scan, collect the required ids, and then join back for the rest of the fields (see the sketch after these comments). The given solution copies the table in full twice and then applies a condition in a full scan. Besides, it cannot fully use any index because of the ASC and DESC sorts at the same time. – newtover Mar 21 '12 at 22:40
  • @continuation, I was wrong: not a full copy, just a subset for the given users. But I would still look at similar questions and compare the variants on your data. – newtover Mar 21 '12 at 22:45
  • Have you tried this solution? The reason I said there might be better ones is because this query could be considered a bit confusing, not majorly inefficient. – Robin Castlin Mar 22 '12 at 07:51
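One way to read newtover's suggestion is sketched below. This is not his exact query; it assumes a hypothetical covering index such as ALTER TABLE content ADD INDEX idx_user_score (user_id, score), so the ranking subquery touches only indexed columns (plus the primary key, which secondary indexes carry implicitly), and the remaining fields are fetched by a join on id afterwards:

SELECT c.id, c.user_id, c.content, c.score
FROM (  SELECT id
        FROM (  SELECT id, user_id,
                       @rn := IF(@uid = user_id, @rn + 1, 1) AS rn,
                       @uid := user_id AS prev_uid
                FROM (  SELECT id, user_id, score
                        FROM content
                        WHERE user_id IN (1001, 1023, 13452, 1679834)
                        ORDER BY user_id ASC, score DESC) AS sorted,
                     (SELECT @rn := 0, @uid := NULL) AS vars
             ) AS ranked
        WHERE rn <= 3
     ) AS top_ids
JOIN content AS c ON c.id = top_ids.id;

The mixed ASC/DESC sort still needs a filesort on versions before 8.0, as newtover points out; the gain is that the sorting and ranking read only the index, not the full rows.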