I have a table containing user records, and I want to take 5 records per user_id
. I don't care about any sort of order. I could do this to get 1 record per user:
SELECT * FROM records GROUP BY user_id
I could also do something with user variables to take the top N records. However, my database is very large and a query with user variables isn't efficiently using the index on the user_id
column because it has to sort within each group. I don't care about order at all, so I shouldn't have to touch records that aren't relevant. Since I only want 5 and each user has 200-400 records, this is a big performance hit.
Is there a way to write this query efficiently?
This question is not the same as asking how to get the top N records, because I don't care about ordering and I believe that removing that restriction should allow an efficient rewriting. If this is not the case, please explain why not. I have clarified this in the title.