3

In the query I am fetching data from three tables : company, classes_by_company, and person. I have a foreign key in all tables with the name company_id. I use a left join to mesh the tables with a matching company_id. I am trying to find out the amount of classes per company and employees. I am getting values that are not correct for both. Here is a SQIDDLE

SELECT a.id, 
a.company_id,
a.status,
COUNT(c.company_id) AS classes_per_company,
COUNT(p.employee_id) AS employees_per_company
FROM company a
LEFT JOIN classes_by_company c
ON a.company_id = c.company_id
LEFT JOIN person p
ON a.company_id = p.company_id
GROUP BY a.company_id

Table structure:

CREATE TABLE company
    (
     id int auto_increment primary key,
     company_id int,
     status varchar(20)
    );
CREATE TABLE classes_by_company
(
 id int auto_increment primary key,
 company_id int,
 class_name varchar(20)
);
CREATE TABLE person
(
 id int auto_increment primary key,
employee_id int,
 company_id int,
 person_name varchar(20)
);

enter image description here

2 Answers2

1

i think you need to COUNT(DISTINCT c.company_id) since your data has entries of the same company_id and class

SELECT a.id, 
       a.company_id,
       a.status,
       COUNT(DISTINCT c.id) AS classes_per_company,
       COUNT(DISTINCT p.employee_id) AS employees_per_company
FROM company a
LEFT JOIN classes_by_company c ON a.company_id = c.company_id
LEFT JOIN person p ON a.company_id = p.company_id
GROUP BY a.id,a.company_id,a.status

according to your updated question: you only need COUNT(DISTINCT p.employee_id)

also i think you should COUNT(DISTINCT c.id) instead of COUNT(c.company_id) since you're trying to count classes and not companies.

Tin Tran
  • 6,194
  • 3
  • 19
  • 34
  • An ANSI standard statement would be nice. – Kermit Jan 06 '14 at 03:24
  • @FreshPrinceOfSO What do you mean by that? I am keen to know. – Tin Tran Jan 06 '14 at 03:35
  • Your answer (by MySQL's fault) is not ANSI standard as you're `SELECT`ing columns which are not in your `GROUP BY`. [Documentation](http://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html) and [SO Question](http://stackoverflow.com/questions/7594865/why-does-mysql-add-a-feature-that-conflicts-with-sql-standards). If you will ever work on a different SQL platform, you will have bad habits to kick. – Kermit Jan 06 '14 at 03:36
  • @TinTran but it still shows 1 class for `company_id = 7`. Check the [SQLFIDDLE](http://sqlfiddle.com/#!2/803440/5) –  Jan 06 '14 at 03:40
  • @FreshPrinceOfSO that's what you mean, i never select columns that aren't in the group by. I was just in a hurry to post an answer. I'll change my answer to make it proper. – Tin Tran Jan 06 '14 at 03:49
  • Yes I skipped over that important line. I will use `COUNT(DISTINCT c.id)` since I am counting by classes. Thanks! –  Jan 06 '14 at 04:05
  • I think the other answer is a better anyways :) – Tin Tran Jan 06 '14 at 04:22
1

I have to ask - why does company have both company_id and id? Shouldn't one of them be sufficient? Regardless of that, your problem is due to the multiplicative nature of joins - that is, if there are 2 employees at a company, and 2 classes, you end up with 4 rows - 1 for each combination (person A + class A, person A + class B, person B + class A, person B + class B). Due to this, here's the other way the issue is usually solved:

SELECT a.id, a.company_id, a.status,
       c.count AS classes_per_company,
       p.count AS employees_per_company
FROM company a
LEFT JOIN (SELECT company_id, COUNT(*) as count
           FROM classes_by_company
           GROUP BY company_id) c
       ON a.company_id = c.company_id
LEFT JOIN (SELECT company_id, COUNT(*) as count
           FROM person
           GROUP BY company_id) p
       ON a.company_id = p.company_id

(and the resulting fiddle - thanks for providing one!)

Note that, while both answers give the correct result, this version is likely to be more performant, as it has a higher chance to be using indices.

Clockwork-Muse
  • 12,806
  • 6
  • 31
  • 45
  • I understand now. Well i was using `id` as just for auto increment and `company_id` as the unique company id but as it being store it doesn't necessarily go in numerical order. –  Jan 06 '14 at 04:01
  • 1
    Ignore "numerical order" out of autogenerated id columns - SQL as a standard essentially states that rows have no inherent order (that's the whole purpose for `ORDER BY`) - things like indices are optimizations. Doing so probably violates the rule of assigning meaning to surrogate keys. If you need to know which came first, use a timestamp column (set to UTC). You should generally only need one internal key per table, either a single column or a composite key (like for cross-reference tables - what happens if a person works for more than one company?). – Clockwork-Muse Jan 06 '14 at 04:12
  • Good Point! I am glad I know this now. I will apply it. –  Jan 06 '14 at 04:37