
I have a database that consists of users who can perform various actions, which I keep track of in multiple tables. I'm creating a point system, so I need to count how many of each type of action the user did. For example, if I had:

users                 posts                comments             shares

id | username         id | user_id         id | user_id         id | user_id
-------------         --------------       --------------       --------------
1  |  abc             1  | 1               1  | 1               1  | 2
2  |  xyz             2  | 1               2  | 2               2  | 2

I would want to return:

user_details

id | username | post_count | comment_count | share_count
---------------------------------------------------------
1  | abc      | 2          | 1             | 0
2  | xyz      | 0          | 1             | 2

This is slightly different from this question about foreign key counts since I want to return the individual counts per table.

What I've tried so far (example code):

SELECT 
  users.id, 
  users.username,
  COUNT( DISTINCT posts.id ) as post_count,
  COUNT( DISTINCT comments.id ) as comment_count, 
  COUNT( DISTINCT shares.id ) as share_count
FROM users
  LEFT JOIN posts ON posts.user_id = users.id
  LEFT JOIN comments ON comments.user_id = users.id
  LEFT JOIN shares ON shares.user_id = users.id
GROUP BY users.id

While this works, I had to use DISTINCT in all of my counts because the LEFT JOINs were producing large numbers of duplicate rows. I feel like there must be a better way to do this, since (please correct me if I'm wrong) each additional LEFT JOIN multiplies the number of duplicated rows that DISTINCT then has to filter out.
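
(To illustrate with made-up numbers, not from my sample tables above: a user with 3 posts, 2 comments and 4 shares would come out of the three LEFT JOINs as 3 × 2 × 4 = 24 rows that DISTINCT then has to collapse back down to the three counts.)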

Thank you so much for any help you could give me with this!

hotdog
  • Join with grouped data. – Serg Jan 23 '22 at 18:49
  • *I had to use DISTINCT in all of my counts because the LEFT JOINS were causing high numbers of duplicate rows* - whacking DISTINCT in to remove volumes of duplicates introduced by Cartesian explosion from faulty joins is never the solution. Fixing the joins is. You can prevent an exponential growth of a dataset by making sure that data joined in is at most 1:1 with existing data – Caius Jard Jan 23 '22 at 20:13
  • The answer you linked to is still the good approach, just turn the `+` to `,` so that you get the columns separately. – jjanes Jan 24 '22 at 00:37
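
A rough sketch of jjanes's suggestion, assuming the linked answer builds a single score by adding correlated subqueries together: keeping them as separate columns instead of summing them would look roughly like this (no COALESCE needed, since COUNT(*) over zero rows already returns 0):

SELECT u.id,
       u.username,
       (SELECT count(*) FROM posts    AS p WHERE p.user_id = u.id) AS post_count,
       (SELECT count(*) FROM comments AS c WHERE c.user_id = u.id) AS comment_count,
       (SELECT count(*) FROM shares   AS s WHERE s.user_id = u.id) AS share_count
  FROM users AS u;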

1 Answer


You can join derived tables that already do the aggregation.

SELECT u.id,
       u.username,
       coalesce(pc.c, 0) AS post_count,    -- users with no matching rows get NULL from the LEFT JOIN, so default to 0
       coalesce(cc.c, 0) AS comment_count,
       coalesce(sc.c, 0) AS share_count
       FROM users AS u
            -- each derived table has at most one row per user_id,
            -- so these joins cannot multiply rows
            LEFT JOIN (SELECT p.user_id,
                              count(*) AS c
                              FROM posts AS p
                              GROUP BY p.user_id) AS pc
                      ON pc.user_id = u.id
            LEFT JOIN (SELECT c.user_id,
                              count(*) AS c
                              FROM comments AS c
                              GROUP BY c.user_id) AS cc
                      ON cc.user_id = u.id
            LEFT JOIN (SELECT s.user_id,
                              count(*) AS c
                              FROM shares AS s
                              GROUP BY s.user_id) AS sc
                      ON sc.user_id = u.id;
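
Run against the sample data in your question, this returns exactly the user_details result you showed.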
sticky bit
  • Thanks - this was really clear and helped me understand concretely what the other people were saying. Thanks for helping me learn something new! – hotdog Jan 31 '22 at 19:15