30

I don't understand the concept of a left outer join, a right outer join, or indeed why we need to use a join at all! The question I am struggling with and the table I am working from is here: Link

Question 3(b)

Construct a command in SQL to solve the following query, explaining why it had to employ the (outer) join method. [5 Marks] “Find the name of each staff member and his/her dependent spouse, if any”

Question 3(c) -

Construct a command in SQL to solve the following query, using (i) the join method, and (ii) the subquery method. [10 Marks] “Find the identity name of each staff member who has worked more than 20 hours on the Computerization Project”

Can anyone please explain this to me simply?

always-a-learner
  • 3,671
  • 10
  • 41
  • 81
John Conrad
  • 313
  • 1
  • 3
  • 5
  • 1
    its the "if any"... left outer joins are the same as left joins (in mysql) and you use them to return rows even though they don't necessarily match rows in another table. – Diego May 03 '11 at 19:41
  • And also, you need to use a join because that is, very simply, how database queries work. It is like asking why we need to use lungs to breathe. Joins are the mechanism by which queries to a relational database are made. Joins are not at all a complicated concept and if you are having problems I suggest you do additional beginning reading on relational databases. If you are indeed a student at the University of Cork then you should be able to access this resource: http://proquest.safaribooksonline.com/?xmlId=1-56592-744-3&uiCode=CORK – Jordan Reiter May 03 '11 at 19:55

5 Answers5

86

Joins are used to combine two related tables together.

In your example, you can combine the Employee table and the Department table, like so:

SELECT FNAME, LNAME, DNAME
FROM
EMPLOYEE INNER JOIN DEPARTMENT ON EMPLOYEE.DNO=DEPARTMENT.DNUMBER

This would result in a recordset like:

FNAME   LNAME   DNAME
-----   -----   -----
John    Smith   Research
John    Doe     Administration

I used an INNER JOIN above. INNER JOINs combine two tables so that only records with matches in both tables are displayed, and they are joined in this case, on the department number (field DNO in Employee, DNUMBER in Department table).

LEFT JOINs allow you to combine two tables when you have records in the first table but might not have records in the second table. For example, let's say you want a list of all the employees, plus any dependents:

SELECT EMPLOYEE.FNAME as employee_first, EMPLOYEE.LNAME as employee_last, DEPENDENT.FNAME as dependent_last, DEPENDENT.LNAME as dependent_last
FROM
EMPLOYEE INNER JOIN DEPENDENT ON EMPLOYEE.SSN=DEPENDENT.ESSN

The problem here is that if an employee doesn't have a dependent, then their record won't show up at all -- because there's no matching record in the DEPENDENT table.

So, you use a left join which keeps all the data on the "left" (i.e. the first table) and pulls in any matching data on the "right" (the second table):

SELECT EMPLOYEE.FNAME as employee_first, EMPLOYEE.LNAME as employee_last, DEPENDENT.FNAME as dependent_first, DEPENDENT.LNAME as dependent_last
FROM
EMPLOYEE LEFT JOIN DEPENDENT ON EMPLOYEE.SSN=DEPENDENT.ESSN

Now we get all of the employee records. If there is no matching dependent(s) for a given employee, the dependent_first and dependent_last fields will be null.

Jordan Reiter
  • 20,467
  • 11
  • 95
  • 161
  • Yeah, I realized just as I'd finished my answer that it was basically a homework assignment (I ignored the questions at the bottom and just focused on the first part plus the linked doc). I added the sentence "Tagged as homework" but then saw that someone else had beaten me to the punch so I removed that line. :) – Jordan Reiter May 04 '11 at 04:18
  • It's not homework. Thanks a lot for the answer, very helpful :). – John Conrad May 04 '11 at 09:10
29

example (not using your example tables :-)

I have a car rental company.

Table car
id: integer primary key autoincrement
licence_plate: varchar
purchase_date: date

Table customer
id: integer primary key autoincrement
name: varchar

Table rental
id: integer primary key autoincrement
car_id: integer
bike_id: integer
customer_id: integer
rental_date: date

Simple right? I have 10 records for cars because I have 10 cars.
I've been running this business for 10 years, so I've got 1000 customers.
And I rent the cars about 20x per year per cars = 10 years x 10 cars x 20 = 2000 rentals.

If I store everything in one big table I've got 10x1000x2000 = 20 million records.
If I store it in 3 tables I've got 10+1000+2000 = 3010 records.
That's 3 orders of magnitude, so that's why I use 3 tables.

But because I use 3 tables (to save space and time) I have to use joins in order to get the data out again
(at least if I want names and licence plates instead of numbers).

Using inner joins

All rentals for customer 345?

SELECT * FROM customer
INNER JOIN rental on (rental.customer_id = customer.id)
INNER JOIN car on (car.id = rental.car_id)
WHERE customer.id = 345.

That's an INNER JOIN, because we only want to know about cars linked to rentals linked to customers that actually happened.

Notice that we also have a bike_id, linking to the bike table, which is pretty similar to the car table but different. How would we get all bike + car rentals for customer 345.
We can try and do this

SELECT * FROM customer
INNER JOIN rental on (rental.customer_id = customer.id)
INNER JOIN car on (car.id = rental.car_id)
INNER JOIN bike on (bike.id = rental.bike_id)
WHERE customer.id = 345.

But that will give an empty set!!
This is because a rental can either be a bike_rental OR a car_rental, but not both at the same time.
And the non-working inner join query will only give results for all rentals where we rent out both a bike and a car in the same transaction.
We are trying to get and boolean OR relationship using a boolean AND join.

Using outer joins

In order to solve this we need an outer join.

Let's solve it with left join

SELECT * FROM customer
INNER JOIN rental on (rental.customer_id = customer.id) <<-- link always
LEFT JOIN car on (car.id = rental.car_id) <<-- link half of the time
LEFT JOIN bike on (bike.id = rental.bike_id) <<-- link (other) 0.5 of the time.
WHERE customer.id = 345.

Look at it this way. An inner join is an AND and a left join is a OR as in the following pseudocode:

if a=1 AND a=2 then {this is always false, no result}
if a=1 OR a=2 then  {this might be true or not}

If you create the tables and run the query you can see the result.

on terminology

A left join is the same as a left outer join. A join with no extra prefixes is an inner join There's also a full outer join. In 25 years of programming I've never used that.

Why Left join

Well there's two tables involved. In the example we linked
customer to rental with an inner join, in an inner join both tables must link so there is no difference between the left:customer table and the right:rental table.

The next link was a left join between left:rental and right:car. On the left side all rows must link and the right side they don't have to. This is why it's a left join

Johan
  • 74,508
  • 24
  • 191
  • 319
  • nice :D thanks for nice explanation at point `AND` and `OR` relation :D it makes me understand about different of those. Before that, I don't understand, why people use `LEFT JOIN` when it always take WHOLE tables :D – hqt Nov 03 '13 at 17:28
2

You use outer joins when you need all of the results from one of the join tables, whether there is a matching row in the other table or not.

Oded
  • 489,969
  • 99
  • 883
  • 1,009
  • 4
    Fully explained here: [http://stackoverflow.com/questions/38549/sql-difference-between-inner-and-outer-join](http://stackoverflow.com/questions/38549/sql-difference-between-inner-and-outer-join) – jk. May 03 '11 at 19:45
0

In general: JOIN joints two tables together. Use INNER JOIN when you wanna "look up", like look up detailed information of any specific column. Use OUTER JOIN when you wanna "demonstrate", like list all the info of the 2 tables.

0

I think Question 3(b) is confusing because its entire premise wrong: you don't have to use an outer join to "solve the query" e.g. consider this (following the style of syntax in the exam paper is probably wise):

SELECT FNAME, LNAME, DEPENDENT_NAME
  FROM EMPLOYEE, DEPENDENT
 WHERE SSN = ESSN
       AND RELATIONSHIP = 'SPOUSE'
UNION 
SELECT FNAME, LNAME, NULL
  FROM EMPLOYEE
EXCEPT 
SELECT FNAME, LNAME, DEPENDENT_NAME
  FROM EMPLOYEE, DEPENDENT
 WHERE SSN = ESSN
       AND RELATIONSHIP = 'SPOUSE'
onedaywhen
  • 55,269
  • 12
  • 100
  • 138