Translated to SQL:
SELECT p.*
FROM posts p
LEFT JOIN (
SELECT DISTINCT ON (post_id)
post_id, user_id
FROM arguments
WHERE moderated = FALSE
ORDER BY post_id, id
) a ON a.post_id = p.id AND a.user_id = 5
WHERE a.post_id IS NOT NULL OR -- condition 1
(a.status = 1 and a.published); -- condition 2
Details for DISTINCT ON
:
Why the LEFT JOIN
?
If there was just condition 1)
, we could simplify:
SELECT p.*
FROM posts p
JOIN (
SELECT DISTINCT ...
) a ON a.post_id = p.id AND a.user_id = 5;
But you added the alternative (not additional) condition 2)
. If we would use a plain [INNER] JOIN
, posts not passing condition 1)
would be dropped from the selection immediately and never get their second chance. I added redundant parentheses (operator precedence would work for us without parentheses) and a line break to point out that this is "the other way in":
(a.status = 1 and a.published)
Of course, we have to add the check for condition 1)
now:
a.post_id IS NOT NULL
This works without duplicating rows because the subquery returns exactly 1 or 0 rows per post (and not more).
This is one way to solve it. Should be an excellent choice for only few attributes per post (2 - 4) and a non-trivial number of qualifying attributes. Depending on your actual data distribution there may be other, faster query techniques: