0

I have just been looking at some example database queries, and came across this :

Find the identifier, name & address of employees of the Research department.

Method I: Join Method

SELECT Ssn, FName, LName, Address
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber
AND Dname = ‘Research’

Method II: Subquery Method

SELECT Ssn, FName, LName, Address
FROM EMPLOYEE
WHERE Dno IN
( SELECT Dnumber
FROM DEPARTMENT
WHERE Dname = ‘Research’ );

In these examples, why can you not leave out the Dno = Dnumber line? How do you know when to include this?

Jason McCreary
  • 71,546
  • 23
  • 135
  • 174
John Conrad
  • 313
  • 1
  • 3
  • 5
  • 1
    @John, where did you get those ugly looking curly quotes from? – Johan May 03 '11 at 21:25
  • @Johan, this syntax `SELECT Ssn, FName, LName, Address FROM EMPLOYEE, DEPARTMENT WHERE Dno = Dnumber` is depreciated since SQL'89, it is much better to uses explicit join syntax cf ansi '92. See `wikipedia:join` – Johan May 03 '11 at 21:31
  • No, this is not homework...the quotes were in my notes! :) – John Conrad May 03 '11 at 21:33
  • @John, it's OK to do homework on SO. Some people mind some people don't. and besides I know the textbook that example came from. – Johan May 03 '11 at 21:39
  • @Johan It's important for answering the question properly... if the question is looking for an "academic" answer we may have a different approach than a "real world" solution. – Matthew May 04 '11 at 14:16

7 Answers7

2

John, you are using implicit join syntax cf ansi SQL '89.

WHERE JOIN

SELECT Ssn, FName, LName, Address
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber
AND Dname = 'Research'

You should never ever use that because it is confusing as hell.
And it causes a lot of errors because it does a cross join if you're not careful.
The following syntax using explicit joins cf ANSI SQL '92 which is much clearer.

SELECT Ssn, FName, LName, Address
FROM EMPLOYEE
inner join DEPARTMENT on (employee.dnumber = department.dno)
WHERE Dname = 'Research'

This also answers why you cannot leave out the dnumber = dno, because that's the join condition

SUBQUERY
A subquery is really a join by other means.
In general you should avoid a subquery because a join is faster (90% of the time)

Some people find subqueries easier to understand. I would say that if you don't grok joins stay away from SQL!
Still sometimes you're doing something to complex or bizarre for a join and then the subquery is an option.

Back to your question: because the subquery is really a join by other means you need that join condition to make the join :-).

Johan
  • 74,508
  • 24
  • 191
  • 319
  • If you look at the exam paper linked in [this question](http://stackoverflow.com/questions/5874733/when-to-use-a-left-outer-join), you will notice the style you insist he "should never ever use" is the very same as the one used in the exam paper. If it were me sitting this exam and I wanted to maximise me score (rather than be a smart alec as usual :) I'd adopt the same style. – onedaywhen May 04 '11 at 10:14
  • @onedaywhen, It's officially depreciated by the ANSI comity in 1992! How long has this exam comity been sleeping? I'm not surprised universities can be so behind the times. I remember doing an audit exam in uni case being a software company, where the official key point was that the chief security officer should guard the "master tape" of the software. Which of course is silly because in software all copies are identical. – Johan May 04 '11 at 10:30
  • "It's officially depreciated by the ANSI [committee] in 1992" -- if it was then it didn't make it to the ISO Standard. The join style in question is compliant with SQL-92, SQL-99 and SQL:2003. I rather think you are confusing style and fashion with ANSI/ISO Standards. – onedaywhen May 04 '11 at 10:34
  • ... you can test it for yourself using the [Mimer SQL-92 online validator](http://developer.mimer.com/validator/parser92/index.tml). – onedaywhen May 04 '11 at 10:36
  • @onedaywhen, it's in ANSI '89 and it's been kept for backwards compatibility. But sorry it's not depreciated it still very bad form and prone to errors and confusing as hell. (Accidental full outer joins and such). – Johan May 04 '11 at 11:18
  • @Johan: I largely agree with you (though it's not nearly as confusing/error prone as NULL and 3VL -- do you **never ever** use that too? :) The SO community in general are so anti 'classic inner join style' that I think I would risk downvotes by using it. However, a lot of people outside of SO still use it, notably authors and teachers of SQL. If you always have to translate some else's preferred style into your own then you are at a disadvantage. I think that forcing your professor or exam marker (often the same person!) to translate from your style to theirs introduces risk of losing marks. – onedaywhen May 04 '11 at 12:50
  • Accidental cartesian/cross joins rather than full outer joins I would have thought. I agree though and whilst I wouldn't downvote an answer using this style I would be much more likely to upvote an alternative answer that uses the `JOIN` syntax. – Martin Smith May 04 '11 at 16:11
1

I would actually write something like this.

SELECT 
   Ssn, FName, LName, Address
FROM 
   EMPLOYEE
LEFT JOIN 
   DEPARTMENT ON EMPLOYEE.Dno = DEPARTMENT.Dnumber
WHERE 
   Dname = "Research"
tereško
  • 58,060
  • 25
  • 98
  • 150
  • Wouldn't that print all employees though, and thats not what we want? Why would this be preferable? I'm not saying it isn't better, just curious! – John Conrad May 03 '11 at 21:36
  • You could try and see .. It **should** print all the employees from "Research department" , if i guess your DB structure correctly. – tereško May 03 '11 at 21:39
1

The Dno = Dnumber is used to bridge the JOIN between the Employee and Department tables. You need it to identify the relationship between tables you are joining.

There are many ways to write JOIN statements.

This is a good tutorial about joins - http://mysqljoin.com/

Jason McCreary
  • 71,546
  • 23
  • 135
  • 174
1

Each employee belongs to a department. Dno = Dnumber is what defines the relationship between the two. So you have to keep that relationship in your join. Dname = 'Research' further filters to only include Research department employees.

Should you not join Dno to Dnumber, you will wind up with a Cartesian product.

Shan Plourde
  • 8,528
  • 2
  • 29
  • 42
0

You use the Dno = Dnumber line to basically create JOIN criteria.

If you didn't use them you'd have a full JOIN However, I would contend that the correct way to do the JOIN is to actually do

SELECT 
    EMPLOYEE.[Ssn]
    , EMPLOYEE.[FName]
    , EMPLOYEE.[LName]
    , EMPLOYEE.[Address]
FROM 
    EMPLOYEE
JOIN 
    DEPARTMENT 
        ON DEPARTMENT.[Dnumber] = EMPLOYEE.[Dno]    
WHERE 
    DEPARTMENT.[Dname] = 'Research'
Matthew
  • 10,244
  • 5
  • 49
  • 104
0

The "Dno = Dnumber" line is the join clause - it tells the query only to include records in the employee table whose Dno matches the department number in the department table.

Neville Kuyt
  • 29,247
  • 1
  • 37
  • 52
0

The Dno = Dnumber clause is needed for you to join the two tables. Without this, you'll get what's called a Cartesian join, where you'll get n x m number of rows returned, where n = # rows in the EMPLOYEE table, and m = # rows in the DEPARTMENT table.

You can find a good tutorial on SQL join at http://www.1keydata.com/sql/sqljoins.html.

nernet
  • 1