2

I have this SQL query to get related comments for each ID in a list:

SELECT comments.id, comments.body, comments.created_at
     , comments.like_count, comments.post_id
FROM   comments
WHERE  comments.is_active AND comments.is_show 
AND    comments.post_id in (1, 7, 9, 11, 3)
GROUP BY comments.id
ORDER BY comments.id
LIMIT 3   <----- this is not working correctly!

I want to get the top 3 comments for each post id in the given list (1, 7, 9, 11, 3).
How to achieve this?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Hadi Farhadi
  • 1,773
  • 2
  • 16
  • 33

3 Answers3

1

One option would be to use row_number. Something like:

SELECT c.id, c.body, c.created_at, c.like_count, c.post_id
FROM (
 SELECT comments.*, row_number() OVER (PARTITION BY post_id ORDER BY id) rn
 FROM comments
 WHERE comments.is_active AND comments.is_show 
  AND  comments.post_id in (1, 7, 9, 11, 3)
) c
WHERE rn <= 3
EdmCoff
  • 3,506
  • 1
  • 9
  • 9
1

You can use a ROW_NUMBER to get 3 comments for every postid

WITH CTE AS (
SELECT comments.id, comments.body, comments.created_at
       , comments.like_count, comments.post_id
       ,ROW_NUMBER() OVER( PARTITION BY post_id  ORDER BY id) rn
 FROM comments
 WHERE comments.is_active AND comments.is_show 
       AND  comments.post_id in (1, 7, 9, 11, 3)
)
SELECT id, body, created_at, like_count, post_id
FROM CTE WHERE rn < 4
ORDER BY id
nbk
  • 45,398
  • 8
  • 30
  • 47
1

This will nuke the performance of solutions suggested so far:

SELECT c.*
FROM   unnest('{1, 7, 9, 11, 3}'::int[]) AS p(post_id)  -- assuming integer?
CROSS  JOIN LATERAL (
   SELECT id, body, created_at, like_count, post_id
   FROM   comments c
   WHERE  is_active
   AND    is_show 
   AND    c.post_id = p.post_id
   ORDER  BY like_count DESC NULLS LAST  -- assuming this defines "top"?
   LIMIT  3
   ) c
ORDER  BY id, like_count DESC NULLS LAST;  -- see below

Needs an index on (post_id, like_count) to be fast.

Unlike slow solutions with row_number(), which have to scan the whole comments table, this can identify the top 3 comments per post_id from the index cheaply. For big tables, this is faster by orders of magnitude.

See:

If like_count is defined NOT NULL (as it probably should), you can simplify to ORDER BY id, like_count DESC. Else you need DESC NULLS LAST, and the optimal index is on (post_id, like_count DESC NULLS LAST). About that:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228