INNER JOIN and Count POSTGRESQL

Question

I am learning postgresql and Inner join I have following table. Employee

Id Name    DepartmentId
1  John S.    1
2  Smith P.   1
3  Anil K.    2

Department

Department
Id Name
1  HR
2  Admin

I want to query to return the Department Name and numbers of employee in each department.

SELECT Department.name , COUNT(Employee.id) FROM Department INNER JOIN Employee ON Department.Id = Employee.DepartmentId Group BY Employee.department_id;

I dont know what I did wrong as I am new to database Query.

have you tried `Group BY Employee.department_id,Employee.Id` — MalcolmInTheCenter, May 19 '21 at 05:24
Sorry my bad I tried wrong table name but will it solve my problem by mine query? — lord stock, May 19 '21 at 05:25
Table definition, error message and Postgres version would be instrumental for any such question. — Erwin Brandstetter, May 19 '21 at 05:36

Erwin Brandstetter · Accepted Answer · 2022-04-03T22:36:31.560

3

When involving all rows or major parts of the "many" table, it's typically faster to aggregate first and join later. Certainly the case here, since we are after counts for "each department", and there is no WHERE clause at all.

SELECT d.name, COALESCE(e.ct, 0) AS nr_employees
FROM   department d
LEFT   JOIN (
   SELECT department_id AS id, count(*) AS ct
   FROM   employee
   GROUP  BY department_id
   ) e USING (id);

Also made it a LEFT [OUTER] JOIN, to keep departments without any employees in the result. And COALESCE to report 0 employees instead of NULL in that case.

Related, with more explanation:

Query with LEFT JOIN not returning rows for count of 0

Your original query would work too, after fixing the GROUP BY clause:

SELECT department.name, COUNT(employee.id)
FROM   department
INNER  JOIN employee ON department.id = employee.department_id
Group  BY department.id;  --!

That's assuming department.id is the PRIMARY KEY of the table, in which case it covers all columns of that table, including department.name. And you may want LEFT JOIN like above.

Aside: Consider legal, lower-case names exclusively in Postgres. See:

Are PostgreSQL column names case-sensitive?

edited Apr 03 '22 at 22:36

answered May 19 '21 at 05:27

Erwin Brandstetter

605,456
145
1,078
1,228

. . Why is it faster to aggregate first? If there is an index on `department`, I can see that it is true. But if the join is an outer join or if the inner join filters out a significant number of rows, then I would expect the join first to be faster. – Gordon Linoff May 19 '21 at 12:06
@GordonLinoff: Why is it faster? Short answer: try it. Long answer is short, too: because there are vastly fewer join operations after condensing the many rows down to a fraction with counts. True, for a selective predicate, it's typically cheaper to join first (or use various alternative query techniques), but this query retrieves counts for *"each department"*. No `WHERE` clause. The first variant will be substantially faster. – Erwin Brandstetter May 19 '21 at 18:53
But before the `group by`, the join operations can use an index on `employee(departmentid)`. Your statement also sounds more like an absolute (i.e. this is almost always faster) and not specific to this particular query. That's why I'm asking. – Gordon Linoff May 19 '21 at 19:04
Oh, not absolute. The statement is in response to this particular question, I'll make that clearer. It applies when all rows or a major part of the "many" table is involved. (Some other details matter, too ...) – Erwin Brandstetter May 19 '21 at 19:11
. . Phew. I thought I might be missing something. Cheers. – Gordon Linoff May 19 '21 at 19:12

INNER JOIN and Count POSTGRESQL

1 Answers1