-2

I AM FULLY AWARE OF THIS QUESTION AND ANSWER: https://stackoverflow.com/a/4796911/2307520 BUT THIS IS A BIT DIFFERENT PROBLEM!

Yes, I understand how they simulated simple FULL OUTER JOIN in MySQL using union and two left joins, but my situation is a bit more complicated (3 tables instead of just 2) and I struggle finding a way how to modify that solution for my needs.

This is simplified database schema: enter image description here

 CREATE TABLE item (`i_id` int, `i_name` varchar(70), b_id int null , s_id int null , PRIMARY KEY (i_id));
 CREATE TABLE box (`b_id` int, `b_name` varchar(70), s_id int null , PRIMARY KEY (b_id) );
 CREATE TABLE shelf (`s_id` int, `s_name` varchar(70) , PRIMARY KEY (s_id));
 INSERT INTO shelf VALUES(1,'shelf1');
 INSERT INTO shelf VALUES(2,'shelf2');
 INSERT INTO shelf VALUES(3,'empty shelf');
 INSERT INTO box VALUES(1,'box in s1 with item in it',1);
 INSERT INTO box VALUES(2,'box without shelf',NULL);
 INSERT INTO box VALUES(3,'empty box',2);
 INSERT INTO item VALUES(1,'item a',1,NULL);
 INSERT INTO item VALUES(2,'item b',1,NULL);
 INSERT INTO item VALUES(3,'item c',2,NULL);
 INSERT INTO item VALUES(4,'item in a shelf',NULL,1);
 INSERT INTO item VALUES(5,'item without location',NULL,NULL);

item can be either in a box, standalone in a shelf, or in neither (not defined). Box can be in a shelf or not defined.

Ideally, I would use SQL query like this: (If I had FULL OUTER JOIN in MySQL)

SELECT *,coalesce(item.b_id,box.b_id) as b_id,coalesce(item.s_id,box.s_id,shelf.s_id) as s_id
FROM item
FULL OUTER JOIN box ON box.b_id=item.i_id
FULL OUTER JOIN shelf ON (shelf.s_id=item.s_id) OR (shelf.s_id=box.s_id)
WHERE ...

That would list all entries for certain criteria (like to see in which shelf a box is if any and all items in it, or anything else)

Community
  • 1
  • 1
David162795
  • 1,846
  • 3
  • 15
  • 20

1 Answers1

1

I think you could pull this off LEFT OUTER JOINs and some unions like in your example question. It's just a bit more complicated since you have an extra table, and therefore, more scenarios to account for.

As I see it, you can reduce this down to four scenarios:

  1. Item exists, and may or may not be in a box that may or may not be on a shelf, but is not directly on a shelf. This will account for all standalone items that aren't in a box and aren't directly on a shelf, as well as items in a box regardless of whether they are on a shelf or not.
  2. Item exists, but is not in a box but is directly on a shelf.
  3. An empty box exists and may or may not be on a shelf. (If it had an item in it, then we would have covered it in scenario 1)
  4. An empty shelf exists.

/*item may or may not be in a box which may or may not be on a shelf, but the item is not directly on a shelf*/

SELECT item.i_id item.i_name, item.b_id, box.b_name, box.s_id, s.s_name FROM left outer join box on item.b_id = box.b_id left outer shelf on box.s_id = shelf.s_id and item.s_id is null

UNION ALL

/*item is directly on the shelf, but aren't in a box*/
SELECT item.i_id, item.i_name, NULL, NULL, shelf.s_id, shelf.s_name FROM item INNER JOIN shelf on item.s_id = shelf.s_id AND item.b_id IS NULL;

UNION ALL

/*empty box may or may not be on the shelf*/
SELECT NULL, NULL, box.b_id, box.b_name, shelf.s_id, shelf.s_name FROM box LEFT OUTER JOIN shelf on box.s_id = shelf.s_id WHERE box IS NOT IN (SELECT DISTINCT b_id FROM item)

UNION ALL

/*Shelfs where there are no boxes nor items sitting on them*/
SELECT NULL, NULL, NULL, NULL, shelf.s_id, shelf.s_name FROM shelf WHERE s_id NOT IN (SELECT distinct s_id FROM item) AND s_id NOT IN (SELECT DISTINCT s_id FROM box);

I think scenario 1 and 2 could be combined with a coalesce, but I like the idea of separating it out since it feels more explicit what we are going after.

Ultimately this feels a bit futile though. Really you should write a SQL statement to go directly after what you are after, like "Give me every shelf that an item is on regardless of whether it's in a box", or "Give me all shelves that have less than 20 items". Instead this just spits out every possible combination of item, box, and shelf, which if referred to by every downstream SQL statement, will not be performant.

JNevill
  • 46,980
  • 4
  • 38
  • 63
  • Presented code have several tiny problems in it, but the core idea helped a lot. Corrected code: http://pastebin.com/QBiyRrnR (and I replaced sub-queries with more joins as joins are usually better at performance) – David162795 Jan 09 '17 at 22:25