I have three tables sharing common unique IDs and some other fields, let's say comments, posts, and tags:
comments:
id | user_id | country_id | zone_id | created_at
posts:
id | user_id | country_id | zone_id | created_at
tags:
id | user_id | country_id | zone_id | created_at
What I want now is the number of rows in comments, posts, and tags, grouped by day (based on the created_at column) and by each combination of user_id, country_id, and zone_id, something like:
date | user_id | country_id | zone_id | count(comments.id) | count(posts.id) | count(tags.id)
The problem is that all three tables have millions of rows, so I want the join to produce as few duplicated rows as possible. I've come up with this:
select date(c.created_at), c.user_id, c.country_id, c.zone_id, count(distinct(c.id)), count(distinct(p.id)), count(distinct(t.id))
from comments c
inner join posts p
inner join tags t
group by date(c.created_at), c.user_id, c.country_id, c.zone_id;
Surprisingly, this gives correct results, but the joins produce a lot of duplicated rows. That is also a problem because I may want to use SUM in the future, and then DISTINCT would no longer help.
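To make the SUM concern concrete, here is a minimal sketch. It assumes a hypothetical numeric column comments.score, which is not part of the schema above, and it spells out the join conditions on the three shared columns plus the day. Each comment row is repeated once for every combination of matching posts and tags rows, so the sum is inflated by that factor:

```sql
-- Hypothetical: comments.score does not exist in the schema above; it is only for illustration.
-- A group with 3 comments, 2 posts and 4 tags produces 3 * 2 * 4 = 24 joined rows,
-- so every score in that group is added 2 * 4 = 8 times.
select date(c.created_at) as created_date,
       c.user_id, c.country_id, c.zone_id,
       sum(c.score) as inflated_score_sum
from comments c
inner join posts p
        on p.user_id = c.user_id
       and p.country_id = c.country_id
       and p.zone_id = c.zone_id
       and date(p.created_at) = date(c.created_at)
inner join tags t
        on t.user_id = c.user_id
       and t.country_id = c.country_id
       and t.zone_id = c.zone_id
       and date(t.created_at) = date(c.created_at)
group by date(c.created_at), c.user_id, c.country_id, c.zone_id;
```

Note that the inner joins also drop any group that has comments but no posts or no tags on that day.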
How can I join these three tables on those three foreign keys (user_id, country_id, zone_id) so that I get only distinct rows?
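One possible direction, offered as an untested sketch rather than a definitive solution: aggregate each table on its own first, then combine the three small result sets. Since every row is counted exactly once inside its own table, nothing is multiplied, and DISTINCT is not needed. The alias names (created_date, comments_count, posts_count, tags_count, per_table) are made up for this example:

```sql
-- Each branch counts one table per day and key combination and pads the other two counts with 0.
-- The outer query then adds the branches together per group.
select created_date, user_id, country_id, zone_id,
       sum(comments_count) as comments_count,
       sum(posts_count)    as posts_count,
       sum(tags_count)     as tags_count
from (
    select date(created_at) as created_date, user_id, country_id, zone_id,
           count(*) as comments_count, 0 as posts_count, 0 as tags_count
    from comments
    group by date(created_at), user_id, country_id, zone_id

    union all

    select date(created_at), user_id, country_id, zone_id,
           0, count(*), 0
    from posts
    group by date(created_at), user_id, country_id, zone_id

    union all

    select date(created_at), user_id, country_id, zone_id,
           0, 0, count(*)
    from tags
    group by date(created_at), user_id, country_id, zone_id
) per_table
group by created_date, user_id, country_id, zone_id;
```

Unlike the inner joins, this form keeps groups that appear in only one or two of the tables, with a count of 0 for the missing ones. The same shape should carry over to SUM by replacing count(*) with a sum over the relevant column in each branch.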