0

I came across the following SQL question about the famous HR schema which ships with Oracle:

Write a SQL query to find all the locations and the departments for each location along with the locations that do not have department. Don't use OUTER JOIN.

With OUTER JOIN it is easy:

SELECT department_name, city 
FROM locations
LEFT OUTER JOIN departments USING (location_id);

I get 43 results. I tried with this:

SELECT department_name, city 
FROM locations l, departments d
WHERE l.location_id = d.location_id OR 
      l.location_id NOT IN (SELECT DISTINCT location_id FROM departments);

enter image description here

but I get only 27 rows as if I do:

SELECT department_name, city  
FROM locations 
JOIN departments USING (location_id);

Why does the OR not work? Thanks!


EDIT:

As pointed out by @mathguy, I had a department with NULL in the location_id column in the departments table. That is why the NOT IN returns no row. Otherwise I would have many more rows looking for the location id from the departments table.

Martin Dimitrov
  • 4,796
  • 5
  • 46
  • 62
  • Probably the bad query is the first one, not the second..Left outer join should use ON, not USING – Massimo Petrus Nov 01 '16 at 14:15
  • @Massimo, why? It does produce the desired results. – Martin Dimitrov Nov 01 '16 at 14:20
  • http://docs.oracle.com/database/121/SQLRF/statements_10002.htm#i2107296 – Massimo Petrus Nov 01 '16 at 14:24
  • @Massimo read further down "Because the column names in this example are the same in both tables in the join, you can also use the common column feature by specifying the USING clause of the join syntax." – Martin Dimitrov Nov 01 '16 at 14:28
  • You're right, i did'nt know...i'm linked to 'traditional' sql – Massimo Petrus Nov 01 '16 at 15:05
  • Are you sure about the number of rows from the second query (the one that's wrong)? I get 43 rows from the first query and 27 from the last query, just like you said - but from the middle one I don't get 27, I get **459** rows. Please double-check (I simply copied your query and pasted in my copy of Oracle, logged in as `hr`, so I don't think it's on my end). –  Nov 01 '16 at 15:16
  • @mathguy Yes, I am absolutely sure. The copy of my `HR` schema is pretty much unchanged so I doubt the problem is from corrupt data. – Martin Dimitrov Nov 01 '16 at 15:20
  • @MartinDimitrov - will you do me a favor and try, anyway? Or would you like me to explain why that's not possible first? It would be a lot easier if you would just copy and paste what you posted, and try. I have no doubt the table is OK; I don't believe THAT QUERY returns only 27 results. –  Nov 01 '16 at 15:31
  • @mathguy, I will do an update with a screenshot. But basically I really do copy and paste. – Martin Dimitrov Nov 01 '16 at 15:33
  • Hint: When `location_id` is NOT IN (...), and there are 16 of those, the WHERE condition is satisfied, so you get at least one row for each pair (department, city of such location_id). –  Nov 01 '16 at 15:34
  • OK, I see the screenshot, but that is really odd. With your second query you should get 27 rows from the match, PLUS 16 * 27 rows for the cities whose location_id is NOT in the departments table. Where did those rows go? –  Nov 01 '16 at 15:40
  • You may want to check the DEPARTMENTS table - do you have a department with NULL in the location_id column? That would explain the difference; if you have a NULL in the NOT IN (...) list, then the NOT IN condition returns no rows. –  Nov 01 '16 at 15:43

4 Answers4

2
 select l.city, d.department_name from locations l, departments d
 where l.location_id=d.location_id
 UNION
 select l2.city, null department_name
 from locations l2
where not exists (select 1 from depertments d2 where   d2.location_id=l2.location_id) 
Massimo Petrus
  • 1,881
  • 2
  • 13
  • 26
2

Get the city, department_name from the inner join, then union all like so:

select city, department_name 
  from <inner join>
union all
select city, NULL
  from locations
 where location_id not in (select location_id from departments);

The second branch of union all will give you the 16 cities with no departments located there.

NOTE: The NOT IN condition will not work if there are departments with NULL in the location_id column. If that is possible, the condition can be altered to not in (select location_id from departments where location_id is not null).

  • Hmm, `select city, NULL from locations where location_id not in (select location_id from departments);` returns 0 rows. – Martin Dimitrov Nov 01 '16 at 15:31
  • It must be the same problem: you must have a department with NULL for location_id. This (I mean your copy) is not the standard HR schema. –  Nov 01 '16 at 15:50
  • Yes. I do. Thanks a lot! – Martin Dimitrov Nov 01 '16 at 15:50
  • Or the HR schema changed between versions of Oracle (I have 12.1) but I doubt it. –  Nov 01 '16 at 15:51
  • @MartinDimitrov - I edited the answer to explain how to modify the NOT IN condition in case there may be NULLs in that column. –  Nov 01 '16 at 15:53
1

Use a correlated subquery:

SELECT l.*,
       (SELECT d.department_name
        FROM departments d
        WHERE d.location_id = l.location_id
       ) as department_name
FROM locations l;

If there is a concern about multiple departments in a location, then just aggregate them together:

SELECT l.*,
       (SELECT LISTAGG(d.department_name, ', ') WITHIN GROUP (ORDER BY d.department_name) as department_names
        FROM departments d
        WHERE d.location_id = l.location_id
       ) as department_name
FROM locations l;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    This won't work because there might be more than one department in a location thus returning multiple rows from the subquery. – Martin Dimitrov Nov 01 '16 at 14:36
  • @MartinDimitrov A join is going to return multiple rows as well? This shows you how to write the query without an OUTER JOIN, I've explained why the OR didn't work in my answer. I don't believe they're looking for you to use implicit syntax. – Derrick Moeller Nov 01 '16 at 14:50
  • Have you tried it? It returns `ORA-01427: single-row subquery returns more than one row` – Martin Dimitrov Nov 01 '16 at 14:52
  • I like the use of `LISTAGG`. I would never think of it. Thank you. – Martin Dimitrov Nov 01 '16 at 16:17
0

Well It's simple if you try to imagane some intermediate steps:

lets imagane cross join of table and after it filter the result with your where clause. In that rowset not any rows with not null locations and null departments will be presented. You may prove it with next query:

SELECT department_name, city 
FROM locations l, departments d
WHERE d.location_id is null;

no row selected

But you specify l.location_id NOT IN (SELECT DISTINCT location_id FROM departments);. This condition not influent to cross join between table.
Because of it you should get extra rows for loction without departments.

And thats why you need left join or union clause.

Michael Piankov
  • 1,989
  • 1
  • 8
  • 19