This is a similar question to this question, but differs slightly.
In the simplest way possible, I have a table that comprises of the following data:
| id | post_id | time_stamp |
|----|---------|----------------------|
| 1 | 1 | 2022-05-07T03:07:18Z |
| 2 | 1 | 2022-05-07T03:07:19Z |
| 3 | 1 | 2022-05-07T03:07:20Z |
| 4 | 1 | 2022-05-07T03:07:21Z |
| 5 | 2 | 2022-05-07T03:07:18Z |
| 6 | 2 | 2022-05-07T03:07:19Z |
| 7 | 2 | 2022-05-07T03:07:20Z |
| 8 | 2 | 2022-05-07T03:07:21Z |
| 9 | 3 | 2022-05-07T03:07:02Z |
| 10 | 3 | 2022-05-07T03:07:03Z |
| 11 | 3 | 2022-05-07T03:07:04Z |
| 12 | 3 | 2022-05-07T03:07:05Z |
| 13 | 3 | 2022-05-07T03:07:06Z |
| 14 | 3 | 2022-05-07T03:07:07Z |
| 15 | 4 | 2022-05-07T03:07:02Z |
| 16 | 4 | 2022-05-07T03:07:03Z |
| 17 | 4 | 2022-05-07T03:07:04Z |
| 18 | 4 | 2022-05-07T03:07:05Z |
| 19 | 5 | 2022-05-07T03:07:06Z |
| 20 | 5 | 2022-05-07T03:07:07Z |
Essentially, I'd like to get the three most recent results from each post_id
, and specifically for only a few post_id
's.
Currently I'm using the following query (inspired from the question linked at the top):
SELECT r.*
FROM post_reply r
WHERE (
SELECT COUNT(*)
FROM post_reply b
WHERE b.post_id = r.post_id AND b.time_stamp >= r.time_stamp
) <= 3
and r.post_id in (1, 2, 3)
ORDER BY r.post_id ASC, r.time_stamp DESC;
Which returns the following result:
| id | post_id | time_stamp |
|----|---------|----------------------|
| 4 | 1 | 2022-05-07T03:07:21Z |
| 3 | 1 | 2022-05-07T03:07:20Z |
| 2 | 1 | 2022-05-07T03:07:19Z |
| 8 | 2 | 2022-05-07T03:07:21Z |
| 7 | 2 | 2022-05-07T03:07:20Z |
| 6 | 2 | 2022-05-07T03:07:19Z |
| 14 | 3 | 2022-05-07T03:07:07Z |
| 13 | 3 | 2022-05-07T03:07:06Z |
| 12 | 3 | 2022-05-07T03:07:05Z |
This works just fine, until you begin to have some entries with duplicate timestamps.
For the next dataset, we'll use some duplicate values for the timestamps:
| id | post_id | time_stamp |
|----|---------|----------------------|
| 1 | 1 | 2022-05-07T03:07:18Z |
| 2 | 1 | 2022-05-07T03:07:18Z |
| 3 | 1 | 2022-05-07T03:07:18Z |
| 4 | 1 | 2022-05-07T03:07:18Z |
| 5 | 2 | 2022-05-07T03:07:18Z |
| 6 | 2 | 2022-05-07T03:07:18Z |
| 7 | 2 | 2022-05-07T03:07:18Z |
| 8 | 2 | 2022-05-07T03:07:19Z |
| 9 | 3 | 2022-05-07T03:07:19Z |
| 10 | 3 | 2022-05-07T03:07:03Z |
| 11 | 3 | 2022-05-07T03:07:04Z |
| 12 | 3 | 2022-05-07T03:07:05Z |
| 13 | 3 | 2022-05-07T03:07:06Z |
| 14 | 3 | 2022-05-07T03:07:07Z |
| 15 | 4 | 2022-05-07T03:07:02Z |
| 16 | 4 | 2022-05-07T03:07:03Z |
| 17 | 4 | 2022-05-07T03:07:04Z |
| 18 | 4 | 2022-05-07T03:07:05Z |
| 19 | 5 | 2022-05-07T03:07:06Z |
| 20 | 5 | 2022-05-07T03:07:07Z |
Which returns much less data:
| id | post_id | time_stamp |
|----|---------|----------------------|
| 8 | 2 | 2022-05-07T03:07:19Z |
| 9 | 3 | 2022-05-07T03:07:19Z |
| 14 | 3 | 2022-05-07T03:07:07Z |
| 13 | 3 | 2022-05-07T03:07:06Z |
Previously, I tried a different approach to the query by using ORDER BY
and LIMIT
in the subquery:
SELECT *
FROM post_reply r
WHERE r.id IN (
SELECT pr.id
FROM post_reply pr
WHERE pr.id = p.id
ORDER BY time_stamp DESC
LIMIT 3
)
AND pr.id IN (1, 2, 3)
However, I'm met with This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
I'm SOL and out of ideas and could definitely use some help understanding how I can reliably achieve what I'd like to do.