1

I am trying to write a query to select all my forums and get the corresponding latest post (including the author)... but I failed.

This is my structure:

forums                      forum_threads              forum_posts
----------                  -------------             -----------
id                          id                        id
parent_forum (NULLABLE)     forum_id                  content
name                        user_id                   thread_id
description                 title                     user_id
icon                        views                     updated_at
                            created_at                created_at
                            updated_at
                            last_post_id (NULLABLE)

This is my current query:

SELECT forum.id, forum.name, forum.description, forum.icon, post_user.username
FROM forums AS "forum"
LEFT JOIN forum_posts AS "post" ON post.thread_id = (
    SELECT id
    FROM forum_threads
    WHERE forum_id = forum.id
    ORDER BY updated_at DESC LIMIT 1)
LEFT JOIN users AS "post_user" ON post_user.id = post.user_id
WHERE forum.parent_forum = 1
GROUP BY forum.id

Of course this query is incorrect, because there are many posts in one thread...

Can anyone help? I am using PostgreSQL btw.

Oh: I forgot: Currently I run through all "categories" (forums which have parent_forum = NULL) and then run an additional query for each forum (that's why you see parent_forum = 1 in my query). Is there a better way to do that?

EDIT: My last post is the post with newest date in updated_at in forum_posts

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
John Reese
  • 583
  • 2
  • 6
  • 17
  • I didn't quite understand your second question about `"categories"`. I suggest to open a *new* question for that. One issue per question is how it should be. You can always reference this one for context to safe some redundant typing. – Erwin Brandstetter Jul 07 '14 at 18:12
  • What I meant is that every forum has a "parent_forum". If "parent_forum" is NULL then it's a category with subforums. – John Reese Jul 07 '14 at 18:34

2 Answers2

2

DISTINCT ON should make this easier:

SELECT DISTINCT ON (f.id)
       f.id, f.name, f.description, f.icon, u.username
FROM   forums             f
LEFT   JOIN forum_threads t ON t.forum_id = f.id
LEFT   JOIN forum_posts   p ON p.thread_id = t.id
LEFT   JOIN users         u ON u.id = p.user_id
WHERE  f.parent_forum = 1
ORDER  BY f_id, p.updated_at DESC;

According to your Q update the latest post is the one with the latest forum_posts.updated_at.
Assuming the column is defined NOT NULL.

Detailed explanation:
Select first row in each GROUP BY group?

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • The query is not working.. when I am running it (with fixed syntax) I get this error: "column "u.username" must appear in the GROUP BY clause or be used in an aggregate function".. and with all the columns PgSQL wants to be grouped by added, it's not working (I get "NULL" as value for username) – John Reese Jul 07 '14 at 17:49
  • Sorry - I updated my question so the latest post is now defined. – John Reese Jul 07 '14 at 18:00
  • @CryNickSystems: Sorry, the `GROUP BY` line was an oversight, now removed. No `GROUP BY` in this query. – Erwin Brandstetter Jul 07 '14 at 18:07
  • It is still not working.. "username" stays NULL, although there are threads and posts belonging to the forums – John Reese Jul 07 '14 at 18:12
  • @CryNickSystems: You get the `username` for the *latest* post. Are you sure it's `NOT NULL`? You should have provided table definition (`\d tbl` in psql), where we can see how columns are defined ... – Erwin Brandstetter Jul 07 '14 at 18:14
  • Yes, I am definitely sure it's not null.. I don't think the definition helps as it is exactly how I posted in the question – John Reese Jul 07 '14 at 18:29
  • @CryNickSystems: Is `p.updated_at` unique (at least per forum)? Or can there be ties? Which post to pick in case of a tie? Otherwise the query should work as is. Pick a suspicious `forums.id` and run the query without `DISTINCT ON (f.id)` to debug. – Erwin Brandstetter Jul 07 '14 at 18:39
  • Okay, I see that it could be because there were two threads without posts oO and apparently your query must have chosen these two threads.. it's working now – John Reese Jul 07 '14 at 18:51
  • Before you updated the requirements I picked the latest post *from the latest thread* as per your original query. So you got your solution now. Good. – Erwin Brandstetter Jul 07 '14 at 19:17
1

Is this what you have in mind? You can get the latest post for a given forum using a subquery.

SELECT forums.id, forums.name, forums.description, forums.icon,
  (SELECT username FROM forum_threads AS ft
  JOIN forum_posts AS fp ON ft.id = fp.thread_id
  JOIN users AS u ON fp.user_id = u.user_id
  WHERE ft.forum_id = forums.id
  ORDER BY updated_at DESC LIMIT 1) AS username_of_latest_post
FROM forums
user2303197
  • 1,271
  • 7
  • 10
  • Your solution seems to be easy to understand, but what if I want to select multiple columns from the subquery? – John Reese Jul 07 '14 at 17:47
  • That makes it more tricky. You can try combining multiple values using a row constructor in a subquery, e.g. `SELECT ROW(username, user_id) ...` – user2303197 Jul 07 '14 at 20:57