-1

I am trying to understand the best way to merge a count from an INNER JOIN of two tables and grouping, but the best thing I can do is to query each table separately and then apply a union and further GROUP BY. This seems a bit convoluted, but with joins the counts get messed up, and I am not sure what other method I can use to query this efficiently.

Correct code, but probably inefficient:

SELECT
  x.id,
  sum(x.question_count) AS question_count,
  sum(x.card_count) AS card_count
FROM (
  SELECT
    c.id,
    count(*) AS question_count,
    0 AS card_count
  FROM
    concepts AS c
    INNER JOIN questions ON c.id = questions."conceptId"
  GROUP BY
    c.id
  UNION ALL
  SELECT
    c.id,
    0 AS question_count,
    count(*) AS card_count
  FROM
    concepts AS c
    INNER JOIN cards ON c.id = cards."conceptId"
  GROUP BY
    c.id) AS x
GROUP BY
  x.id
ORDER BY x.id;

output:

id q_count c_count
1 1 2
2 7 9
3 1 1

My hopeful join code, that gives incorrect counts:

SELECT
  x."conceptId",
  q_count,
  c_count
FROM (
  SELECT
    q."conceptId",
    count(*) AS q_count
  FROM
    questions AS q
  GROUP BY
    q."conceptId") AS x
  INNER JOIN (
    SELECT
      c."conceptId",
      count(*) AS c_count
    FROM
      questions AS c
    GROUP BY
      c."conceptId") AS y ON x."conceptId" = y."conceptId";

output:

id q_count c_count
1 1 1
2 7 7
3 1 1
GMB
  • 216,147
  • 25
  • 84
  • 135
StefanM
  • 63
  • 1
  • 11

1 Answers1

1

You would typically aggregate the related table first, then join them to the base table:

select co.id, qu.question_count, ca.card_count
from concepts co
inner join (
    select conceptId, count(*) question_count
    from questions
    group by conceptId
) qu on qu.conceptId = co.id
inner join (
    select conceptId, count(*) card_count
    from cards 
    group by conceptId
) ca on ca.conceptId = co.id

Depending on the availability of your data in the related tables, you might want left joins instead.

Depending on your data as well (typically, if you have few concepts and many questions and cards), it might be more efficient to use correlated subqueries or lateral joins, like so:

select co.id, qu.question_count, ca.card_count
from concepts co
cross join lateral (select count(*) question_count from questions qu where qu.conceptId = co.id) qu
cross join lateral (select count(*) card_count     from cards     ca where ca.conceptId = co.id) ca
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Oh I didn't know about `CROSS JOIN LATERAL` [very useful reading here](https://stackoverflow.com/questions/28550679/what-is-the-difference-between-a-lateral-join-and-a-subquery-in-postgresql) for whoever looks at this later. Thank you very much! – StefanM Mar 10 '23 at 17:02