1

We have been learning SQL Server programming in Database Systems class. The professor goes exceptionally fast and is not very open to asking questions. I did ask him this, but he just advised me to review the code he'd given (which doesn't actually answer the question).

When making a query, what is the difference between using the term JOIN and using the "=" operator? For example, I have the following query:

SELECT VENDOR_NAME, ITEM_NAME, QTY
FROM   VENDOR, VENDOR_ORDER, INVENTORY
WHERE  VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ID
 AND  VENDOR_ORDER.INV_ID = INVENTORY.INV_ID
ORDER  BY VENDOR_NAME

In class the professor has used the following code:

SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME 
FROM CUSTOMER   JOIN INVOICE USING (CUS_CODE)
        JOIN LINE USING (INV_NUMBER) 
        JOIN PRODUCT USING (P_CODE)
WHERE P_DESCRIPT = 'Claw hammer';

It seems to me that using a join is performing the same function as the "=" is in mine? Am I correct or is there a difference that I am unaware of?

Edit: Trying to use Inner Join based on things I've found on Google. I ended up with the following.

SELECT VENDOR_NAME, ITEM_NAME, QTY
FROM   VENDOR, VENDOR_ORDER, INVENTORY
        INNER JOIN VENDOR_ORDER USING (VENDOR_ID)
        INNER JOIN INVENTORY USING (INV_ID)
ORDER  BY VENDOR_NAME

Now I get the error message ""VENDOR_ID" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90. " I'm using 2014, so my compatibility level is 120.

Inessaria
  • 79
  • 1
  • 2
  • 10
  • 2
    You are using the old legacy join syntax. Use explicit join syntax like your professor – juergen d Apr 29 '16 at 00:46
  • 2
    Have a look at this, http://stackoverflow.com/questions/334201/why-isnt-sql-ansi-92-standard-better-adopted-over-ansi-89 – Ola Ekdahl Apr 29 '16 at 00:49
  • 2
    That syntax is in a SQL Server class? – ZLK Apr 29 '16 at 00:53
  • So they are doing the same thing, but JOIN is preferred? Also, it will make large queries run slightly faster? – Inessaria Apr 29 '16 at 00:57
  • 2
    `USING` doesn't work in SQL Server, as far as I know. You should be using explicit `JOIN` syntax with an `ON` clause. – Gordon Linoff Apr 29 '16 at 00:57
  • The 2nd one is. The first one is me trying to complete a homework that I don't really understand all that well using my Google-Fu powers. I'm really struggling in the class (we all are) because he is teaching it almost like it is a refresher course rather than the first time we've been exposed to it. He is going through a 30-40 page chapter in 50 minutes. – Inessaria Apr 29 '16 at 00:58
  • I tried using Inner Join (see edit to original), and now it doesnt work. I really don't understand Join apparently. Which is unfortunate as I thought I was starting to. – Inessaria Apr 29 '16 at 01:17

2 Answers2

3

The difference between what you are doing (in your first example) and what your professor is doing is that you are creating a set of all possible combinations of the rows in those tables, then narrowing your results to the ones that match the way you want them to. He is creating a set of only the rows that match the way you want them to in the first place.

If your tables were:

Table1
ID1
1   
2    
3    

Table2
ID2
1    
2    
3    

Your query starts with basically a cross join:

Select * from Table1, Table2

ID1  ID2
1    1
2    1
3    1
1    2
2    2
3    2
1    3
2    3
3    3

Then narrows that result set down by applying the where ID1 = ID2

ID1  ID2
1    1
2    2
3    3

This is inefficient and somewhat difficult to read in more complex examples, as people have mentioned in the comments.

Your professor is building the criteria to relate the two tables into the join itself, so he is effectively skipping the first step. In our example tables, this would be Select * from Table1 join Table2 on ID1 = ID2.

There are several types of joins in SQL, which differ based on how you want to handle cases where a value exists in one of your tables, but has no match in the other table. See traditional venn diagram explanation from http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins: Traditional Venn Diagram

APH
  • 4,109
  • 1
  • 25
  • 36
  • That is a lovely answer and explains it to me very well. So using the following code instead of m original would be correct? SELECT VENDOR_NAME, ITEM_NAME, QTY FROM VENDOR INNER JOIN VENDOR_ORDER ON VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ID INNER JOIN INVENTORY ON VENDOR_ORDER.INV_ID=INVENTORY.INV_ID – Inessaria Apr 29 '16 at 01:35
  • 2
    I disagree. Functionally, and performance wise **there is no difference**. Don't promote this myth. If you look at the query plans between the two syntaxes, they do exactly the same thing. You've also posted my most disliked join explanation figure table which does not explain one of the key concepts of what happens in a one to many join..... you get many records, not one! – Nick.Mc Apr 29 '16 at 01:37
  • @Inessaria - Yes, that looks good (provided you only intend to return rows with a value in all three tables). – APH Apr 29 '16 at 01:38
  • @Nick.McDermaid, I didn't say there was a performance difference - I am just trying to explain, step by step, what the logic is doing. I am sorry you don't like the venn diagrams; please feel free to post your own answer explaining the different types of joins as you see fit. – APH Apr 29 '16 at 01:40
  • "This is inefficient".. maybe you were referring to inefficiently writing code. Again if you look at the query plan you won't see a cross join but I guess you are explaining it conceptually, in which case, I think you've done a pretty good job there :) – Nick.Mc Apr 29 '16 at 01:48
2

Don't worry it's your professors issue not yours. Make sure you give appropriate feedback at the end of the course ;)

Hang in there.

So here is some info:

So the first issue is: your professor should not be teaching you USING because it has limited implementation (it definitely won't work in SQL Server) and IMHO it's a bad idea because you should explicitly list join columns.

Here are some queries that will work in SQL Server - lets build them up bit by bit. I will need to make some assumptions

First just join vendor to vendor order:

SELECT VENDOR.VENDOR_NAME, VENDOR_ORDER.QTY
FROM   VENDOR
INNER JOIN 
VENDOR_ORDER
ON VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ORDER

By using inner join we match these two tables on VENDOR_ID

If you have seven records in VENDOR_ORDER with VENDOR_ID = 7, and one record in table VENDOR then the result of this will be.... 7 records, with the data from the VENDOR table repeating seven times.

Now to that, join in inventory

SELECT VENDOR.VENDOR_NAME, INVENTORY.ITEM_NAME, VENDOR_ORDER.QTY
FROM   VENDOR
INNER JOIN 
VENDOR_ORDER
ON VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ORDER
INNER JOIN
INVENTORY ON INVENTORY.INV_ID = VENDOR_ORDER.INV_ID
ORDER  BY VENDOR.VENDOR_NAME

This 'INNER JOIN' syntax is the modern version (often referred as SQL-92). Having a comma seperated list after the FROM clause is 'old school'

Both methods work the same way but the old school way causes ambiguities if you start using outer joins. So get into the habit of doing it the new way.

Lastly, to neaten things up you can use an 'allias'. Which means you give each table a shorter name then use that. I've also added in the invoice number so you can get an idea of what's going on:

SELECT V.VENDOR_NAME, I.ITEM_NAME, ORD.INV_ID, ORD.QTY
FROM   VENDOR As V
INNER JOIN 
VENDOR_ORDER As ORD
ON V.VENDOR_ID = ORD.VENDOR_ORDER
INNER JOIN
INVENTORY As I ON I.INV_ID = ORD.INV_ID
ORDER  BY V.VENDOR_NAME
Nick.Mc
  • 18,304
  • 6
  • 61
  • 91
  • Thank you, that helps as well. I frequently feel overwhelmed by going so fast and being unable to get sufficient answers to my questions. At least my Google-fu is getting a workout! – Inessaria Apr 29 '16 at 02:00
  • When using an alias, do I have to declare each one? I see where you have "VENDOR as V", but not for the rest. Does SQL just "know"? – Inessaria Apr 29 '16 at 02:01
  • Well there is `VENDOR_ORDER as ORD` and `INVENTORY As I` so these two tables are also aliased – Nick.Mc Apr 29 '16 at 04:35
  • (To answer Inessaria's question, not critique Nick's answer, which is helpful). You don't have to use an alias unless you have columns with the same name in multiple tables; for example, if you only have vendor_ID in your vendor table, SQL will know what you mean without the aliases. You an also use the whole table name when specifying the columns, although most people assign shorter aliases instead. IMO it's a good idea to use aliases even if you don't have to, to make your code clearer. – APH Apr 29 '16 at 05:35
  • I do now see the alias assignments, but they come after the opening line that uses the aliases. In Java I'm used to things being sequential, so using an alias before assigning it would give an error. Is that different in SQL? Does it examine the entire expression before executing it, therefore seeing the alias assignments and not returning an error? – Inessaria Apr 29 '16 at 14:57
  • That's right. Forget any notion of SQL being run in the 'order that you type it'. i.e. just because you put a table first in the expression doesn't mean that it operates on that table first. It might do it last. – Nick.Mc May 02 '16 at 07:17