-1

I have three tables that describe data where there are some main things. Each thing can have several categories below it, and each category can have multiple segments.

main

main_id main_name
1 bla
2 blub

categories

cat_id main_id category_name
4 1 bla cat
5 1 blub cat
6 2 ble cat
7 2 blib cat

segments

seg_id cat_id segment_name
10 4 bla seg
11 4 blub seg
12 5 bli seg
13 6 blob seg

I'm augmenting the main table with a column that shows the number of categories below each main thing, as well as the total number of segments in all categories belonging to it.

main_id main_name n_cats n_segs
1 bla 2 3
2 blub 2 1

I tried with the following SQL statement:

SELECT
    m.main_id, m.main_name, count(c.cat_id) as n_cats, count(s.seg_id) as n_segs
FROM main
    LEFT JOIN categories c ON c.main_id = m.main_id
    LEFT JOIN segments s ON s.cat_id = c.cat_id
GROUP BY m.main_id

But it yields the same values for n_cats and n_segs, instead of different ones as desired:

main_id main_name n_cats n_segs
1 bla 2 2
2 blub 2 2
philipxy
  • 14,867
  • 6
  • 39
  • 83
floz
  • 39
  • 2
  • 1
    The casue of this problem is, in short: After joining categories, you will get two rows for `main_id=1`. When you add another join (to segements) this will start adding up... – Luuk Jan 08 '23 at 14:15
  • 1
    Your issue is not reproducible >> https://www.db-fiddle.com/f/hFXvHSvmLxg3tqL5FcrASG/0. The only difference with your code is in the fields included in the `GROUP BY` clause, yet by removing `m.main_name` from your `SELECT` clause, the same result is yielded. – lemon Jan 08 '23 at 14:17
  • Does this answer your question? [Two SQL LEFT JOINS produce incorrect result](https://stackoverflow.com/questions/12464037/two-sql-left-joins-produce-incorrect-result) – philipxy Jan 09 '23 at 05:54
  • This is a faq. Please before considering posting: Pin down code issues via [mre]. Read the manual/reference & google any error message & many clear, concise & precise phrasings of your question/problem/goal, with & without your particular names/strings/numbers, 'site:stackoverflow.com' & tags; read many answers. Reflect research in posts. SO/SE search is poor & literal & unusual, read the help. Google re googling/searching, including Q&A at [meta] & [meta.se]. [How much research effort is expected of Stack Overflow users?](https://meta.stackoverflow.com/q/261592/3404097) [ask] [Help] – philipxy Jan 09 '23 at 06:00
  • Thank you all for getting back on this, and thank you for the fiddles. They show indeed that my problem must be elsewhere, as it works just as expected here with the minimal examples, but not in my true dataset. So clearly, the problem there must be of a different nature, as the SQL code here confirms my understanding how it should work. – floz Jan 09 '23 at 13:45

2 Answers2

2

I executed you SQL with same data set,

SELECT
    m.main_id, m.main_name, COUNT(c.cat_id) AS n_cats, COUNT(s.seg_id) AS n_segs
FROM main m
    LEFT JOIN categories c ON c.main_id = m.main_id
    LEFT JOIN segments s ON s.cat_id = c.cat_id
GROUP BY m.main_id

I am getting this result, the only difference is I use m as an alias FROM main m

enter image description here

Dickens A S
  • 3,824
  • 2
  • 22
  • 45
1

It looks like you need this SELECT clause using COUNT(DISTINCT col).

SELECT m.main_id, m.main_name, 
       count(DISTINCT c.cat_id) as n_cats, 
       count(DISTINCT s.seg_id) as n_segs

This will undo the combinatorial explosion caused by your JOINs.

O. Jones
  • 103,626
  • 17
  • 118
  • 172