0

I have the following question:

Find the names of employees who work on all the projects controlled by department number 5.

The tables I have are Employees (SSN), Works_On(SSN, PNumber), Projects(PNumber, DNumber).

I'm trying to figure out the "ALL" part. I tried different methods, but the thing I'm trying to do is create a canonical query tree, so I need to make sure the SQL query can be converted into that.

I've tried the following:

SELECT E.Fname, E.Lname
FROM Employee AS E, works_on AS W
WHERE E.SSN IN ALL
(
SELECT SSN
FROM Projects AS P
WHERE W.SSN = E.SSN AND W.PNUMBER = P.PNUMBER AND P.DNUMBER = 5
);

But I'm unsure it will work since I used IN ALL. One more thing I tried is getting the count of the works on and group by employee, and getting the count of the projects.

Eventually I found this:

select fname, lname
from employee
where not exists ( (select pnumber from project where dnum = 5)
                   MINUS 
                   (select pno from works_on where essn = ssn)
                 );

But I wasn't able to convert it into a tree.

Do you have any suggestions that I might use? And yes it's Homework.

thethiny
  • 1,125
  • 1
  • 10
  • 26
  • Can you find employees who don't? – Strawberry May 03 '18 at 17:00
  • (SSN is a poor primary key in my view) – Strawberry May 03 '18 at 17:02
  • @Strawberry we're using a sample database. But yes if you have a solution for finding Employees who DON'T, and then we subtract from all employees then it's fine. – thethiny May 03 '18 at 17:06
  • 2
    I do, but it's not my homework – Strawberry May 03 '18 at 17:08
  • @Strawberry hahaha okay I got you, so you were just hinting at what I can do. If I can't find All employees then how can I find Not All? – thethiny May 03 '18 at 17:09
  • So you need to find the employees that work on every (ALL) project? Not just one of them? – jbrahy May 03 '18 at 17:16
  • @jbrahy yes every project. – thethiny May 03 '18 at 17:19
  • This use of "all" is called "relational division". (And don't bother to use "all" in "return all tuples where...".) Always google many clear, concise, specific versions/variants/phrasings of your question/problem/goal/desiderata with & without your specific strings/names/code and read many answers from many questions, which should inform your further googling. If you don't find an answer after applying what you learned and repeating this then ask a question. Use the most frequently productive keywords as tags. Use the best search as a title. – philipxy May 03 '18 at 21:23
  • Please read & act on hits googling 'stackexchange homework'. Also [mcve] re explaining--and finding--your desired query. What does "a canonical query tree" mean? MySQL has no "MINUS" (SQL EXCEPT), google how to express it via IN, EXISTs and/or LEFT JOIN. PS [Is there any rule of thumb to construct SQL query from a human-readable description?](https://stackoverflow.com/a/33952141/3404097) – philipxy May 03 '18 at 21:31
  • @Strawberry Constraints are not needed to query. [But yeah SSNs in real life.](https://stackoverflow.com/a/45437673/3404097) – philipxy May 03 '18 at 21:32
  • What if department number 5 has no projects? It's not clear what you want. In formal settings we interpret that "all" as meaning every employee would work on all its projects. Be clear about what you want the table to hold no matter what the circumstances are. – philipxy May 04 '18 at 02:36
  • @thethiny Null is not a number. Presumaby it's not a department number. You are not explicit about how your table represents situations. So you haven't said anything that suggests null can be in the PNumber column. Can it appear in column SSN? You don't say. Your question is not clear about what input can be & what is desired output as a function of input. See [mcve]. PS You talk about "converting" a query into a "canonical" "tree". That is not clear but it's also not clear what it means or what it has to do with the rest of the question. – philipxy May 04 '18 at 18:42

2 Answers2

1
DROP TABLE IF EXISTS employees;

CREATE TABLE employees (employee_id SERIAL PRIMARY KEY);

DROP TABLE IF EXISTS employee_project;

CREATE TABLE employee_project (employee_id INT NOT NULL, project_id INT NOT NULL,PRIMARY KEY(employee_id,project_id));

DROP TABLE IF EXISTS projects;

CREATE TABLE projects(project_id SERIAL PRIMARY KEY, department_id INT NOT NULL);

INSERT INTO employees VALUES (101),(102),(103);

INSERT INTO employee_project VALUES (101,5004),(101,5005),(101,5006),(101,5007),(102,5004),(102,5007),(102,5008),(103,5006),(103,5007),(103,5008);

INSERT INTO projects VALUES (5004,1),(5005,2),(5006,5),(5007,5),(5008,4),(5009,3);

Consider the following:

SELECT * 
  FROM employees e 
  JOIN projects p 
  LEFT 
  JOIN employee_project ep 
    ON ep.employee_id = e.employee_id 
   AND ep.project_id = p.project_id 
 WHERE p.department_id = 5;
+-------------+------------+---------------+-------------+------------+
| employee_id | project_id | department_id | employee_id | project_id |
+-------------+------------+---------------+-------------+------------+
|         101 |       5006 |             5 |         101 |       5006 |
|         102 |       5006 |             5 |        NULL |       NULL |
|         103 |       5006 |             5 |         103 |       5006 |
|         101 |       5007 |             5 |         101 |       5007 |
|         102 |       5007 |             5 |         102 |       5007 |
|         103 |       5007 |             5 |         103 |       5007 |
+-------------+------------+---------------+-------------+------------+

From this result, we can make two observations, either of which could prove helpful in solving the problem.

  1. The number of distinct projects in the result is equal to the number of distinct projects listed alongside users 101 and 103 (i.e., the number of times those numbers appear in column 4).
  2. User 102 has a null result, which means that they're not involved in all of that department's projects.
Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • Okay so what you did here was merge EVERY employee with EVERY project, and selecting all where they do not have a NULL. Sounds great! Thanks! – thethiny May 03 '18 at 17:30
  • @thethiny Again--If a null can be in a permitted result then you have not clearly posed the question. – philipxy May 04 '18 at 02:38
  • @philipxy my question simply needed a solution that can be converted into a query tree. I didn't mention that NULLs aren't permitted. – thethiny May 04 '18 at 14:54
1

I think group_concat() is the simplest solution. Assuming no duplicates, you can get all the projects for department 5 as:

select group_concat(p.pnumber order by p.pnumber)
from projects p
where dnumber = 5;

Then, you can do the same thing for the employees and match them:

select e.ssn, group_concat(wo.pnumber order by wo.pnumber)
from works_on wo
where wo.pnumber in (select p.pnumber from projects p where p.dnumber = 5)
group by e.ssn;

And finally match the two, such as in a having clause:

select e.ssn, group_concat(wo.pnumber order by wo.pnumber) as projects
from works_on wo
where wo.pnumber in (select p.pnumber from projects p where p.dnumber = 5)
group by e.ssn
having projects = (select group_concat(p.pnumber order by p.pnumber)
                   from projects p
                   where dnumber = 5
                  );

Another approach using relational logic would go as:

select e.ssn
from employees e cross join
     projects p left join
     works_on wo
     on e.ssn = wo.ssn and p.pnumber = wo.pnumber
where p.dnumber = 5
group by e.ssn
having count(wo.snn) = count(*);   -- no NULL values
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786