Why use a JOIN clause versus a WHERE condition?

Question

I develop against Oracle databases. When I need to manually write (not use an ORM like hibernate), I use a WHERE condition instead of a JOIN.

for example (this is simplistic just to illustrate the style):

Select *
from customers c, invoices i, shipment_info si
where c.customer_id = i.customer_id
    and i.amount > 999.99 
    and i.invoice_id = si.invoice_id(+)  -- added to show a replacement for a join
order by i.amount, c.name

I learned this style from an OLD oracle DBA. I have since learned that this is not standard SQL syntax. Other than being non-standard and much less database portable, are there any other repercussions to using this format?

score 16 · Answer 1 · answered Oct 23 '09 at 13:19

16

I don't like the style because it makes it harder to determine which WHERE clauses are for simulating JOINs and which ones are for actual filters, and I don't like code that makes it unnecessarily difficult to determine the original intent of the programmer.

answered Oct 23 '09 at 13:19

Hank Gay

70,339
36
160
222

3

+1 for "I don't like code that makes it unnecessarily difficult to determine the original intent of the programmer" – Adam Paynter Oct 23 '09 at 13:56
2

+1 IMHO It's the reason why we should use the JOINs into FROM clause. It's clearer when join are separated from the filter conditions. – Luc M Dec 16 '10 at 21:19

score 10 · Answer 2 · answered Oct 23 '09 at 13:14

10

The biggest issue that I have run into with this format is the tendency to forget some join's WHERE clause, thereby resulting in a cartesian product. This is particularly common (for me, at least) when adding a new table to the query. For example, suppose an ADDRESSES table is thrown into the mix and your mind is a bit forgetful:

SELECT *
  FROM customers c, invoices i, addresses a
 WHERE c.customer_id = i.customer_id
   AND i.amount > 999.99
 ORDER BY i.amount, c.name

Boom! Cartesian product! :)

answered Oct 23 '09 at 13:14

Adam Paynter

46,244
33
149
164

2

This is true, but like all (err most) syntax errors, won't you see the problem when you try to run, and then fix it? That has been my experience in this matter. – Jay Oct 23 '09 at 13:21
1

@Jay: True enough. I still find it annoying when you get a big, runaway query started on low-resource Oracle servers. :) – Adam Paynter Oct 23 '09 at 13:28
5

Jay, sadly no most people won't see the problem, they just add a distinct and go on their merry way. – HLGEM Oct 23 '09 at 13:33
1

Cartesian products are also allowed using the JOIN syntax by leaving out the JOIN condition. This is a bit more obvious in the JOIN syntax but still quite possible. And, @HLGEM, adding DISTINCT to the SELECT will not address the issue of a Cartesian production especially in the common case where a primary key is returned from each participating table. – Larry Lustig Oct 23 '09 at 13:51
@Larry: Interesting, I didn't even know that you could leave out the join condition! – Adam Paynter Oct 23 '09 at 13:56
@Adam I *think* the standard behavior when a `JOIN` condition is omitted is a natural join where it does an implicit `=` for each pair of columns with matching names, but it probably varies between RDBMS. – Hank Gay Oct 23 '09 at 13:59
If you specify the join type, i.e. `INNER JOIN`, most DBMSes force you to include an ON or USING statement. – Powerlord Oct 23 '09 at 17:57

score 7 · Answer 3 · edited Feb 07 '14 at 11:45

7

The old style join is flat out wrong in some cases (outer joins are the culprit). Although they are more or less equivalent when using inner joins, they can generate incorrect results with outer joins, especially if columns on the outer side can be null. This is because when using the older syntax the join conditions are not logically evaluated until the entire result set has been constructed, it is simply not possible to express a condition on a column from outer side of a join that will filter records when the column can be null because there is no matching record.

As an example:

Select all Customers, and the sum of the sales of Widgets on all their Invoices in the month Of August, where the Invoice has been processed (Invoice.ProcessDate is Not Null)

using new ANSI-92 Join syntax

 Select c.name, Sum(d.Amount)
 From customer c
    Left Join Invoice I 
        On i.custId = c.custId
            And i.SalesDate Between '8/1/2009' 
                      and '8/31/2009 23:59:59'
            And i.ProcessDate Is Not Null
    Left Join InvoiceDetails d 
        On d.InvoiceId = i.InvoiceId
            And d.Product = 'widget'
 Group By c.Name

Try doing this with old syntax... Because when using the old style syntax, all the conditions in the where clause are evaluated/applied BEFORE the 'outer' rows are added back in, All the UnProcessed Invoice rows will get added back into the final result set... So this is not possible with old syntax - anything that attempts to filter out the invoices with null Processed Dates will eliminate customers... the only alternative is to use a correlated subquery.

edited Feb 07 '14 at 11:45

Beryllium

12,808
10
56
86

answered Oct 23 '09 at 13:26

Charles Bretana

143,358
22
150
216

1

The older syntax does not evaluate OUTER JOINs incorrectly. It simply is not capable of expressing the idea of an OUTER JOIN. The lack of a feature is not the same thing as implementing the feature incorrectly. – Larry Lustig Oct 23 '09 at 13:32
2

Both Oracle and SQL Server had syntax for expressing outer joins, Oracle used '"+="' or '"=+"' and SQL Server used to use '"*="' or '"=*"' to represent left, and right outer joins, respectively... – Charles Bretana Oct 23 '09 at 14:04
1

@Charles, Could you give an example where the old join style is plain wrong? – tuinstoel Oct 23 '09 at 14:46
@Charles: there is nothing "wrong" with the syntax in the asker's original code sample - it does what it does. Also note that the older ways of doing outer joins in Oracle and SQL Server are *different,* which means *not portable,* which was the point of *my* answer (I'm assuming you were the down-voter). – MusiGenesis Oct 23 '09 at 15:46
1

No there are several cases where they are just wrong. I will add examples to my answer... – Charles Bretana Oct 23 '09 at 16:48
1

from SQL Server BOL(2000) "In earlier versions of Microsoft® SQL Server™ 2000, left and right outer join conditions were specified in the WHERE clause using the *= and =* operators. In some cases, this syntax results in an ambiguous query that can be interpreted in more than one way." They say themselves that outer joins specified this way don't work properly. And the old outer join syntax has been deprecated. Why are we still arguing about using a syntax that is 17 years out of date? – HLGEM Oct 23 '09 at 17:08
1

because too many folks are still using it out of habit and a false belief that they are functionally equivilent. – Charles Bretana Oct 23 '09 at 17:49

score 6 · Accepted Answer · answered Oct 23 '09 at 13:11

6

Some people will say that this style is less readable, but that's a matter of habit. From a performance point of view, it doesn't matter, since the query optimizer takes care of that.

answered Oct 23 '09 at 13:11

Erich Kitzmueller

36,381
5
80
102

4

Except that for certian outer join queries the older syntax cab easily give incorrect results, and therefore should be avoided. – Charles Bretana Oct 23 '09 at 13:29
Charles: Agreed, outer joins using the (+) syntax are a PITA. Hard to comprehend, error-prone... – Erich Kitzmueller Oct 23 '09 at 14:07
On the other hand the ANSI syntax has been more buggy and also has given incorrect results. I don't recall a situation where the oracle join synatx did so though ... do you have an example? – David Aldridge Oct 23 '09 at 14:08
3

I find it to be the exact opposite. It is more readable having the query split into columns, tables, joins and other limiting factors. I find the (+) syntax to be much easier to work with than the ANSI. Personal preference I guess. – Doug Porter Oct 23 '09 at 14:42
1

The (+) syntax is used by companies other than Oracle? – Powerlord Oct 23 '09 at 18:07

Joel Coehoorn · Answer 5 · 2009-10-23T18:06:40.287

4

I have since learned that this is not standard SQL syntax.

That's not quite true. The "a,b where" syntax is from the ansi-89 standard, the "a join b on" syntax is ansi-92. However, the 89 syntax is deprecated, which means you should not use it for new queries.

Also, there are some situations where the older style lacks expressive power, especially with regard to outer joins or complex queries.

It can be a pain going through the where clause trying to pick out join conditions. For anything more than one join the old style is absolute evil. And once you know the new style, you may as well just keep using it.

edited Oct 23 '09 at 18:06

answered Oct 23 '09 at 13:28

Joel Coehoorn

399,467
113
570
794

1

Totally agree--especially when your join clauses are more complex than simple equalities. – Philip Kelley Oct 23 '09 at 14:27

score 3 · Answer 6 · answered Oct 23 '09 at 13:24

This is a standard SQL syntax, just an older standard than JOIN. There's a reason that the syntax has evolved and you should use the newer JOIN syntax because:

It's more expressive, clearly indicating which tables are JOINed, the JOIN order, which conditions apply to which JOIN, and separating out the filtering WHERE conditions from the JOIN conditions.
It supports LEFT, RIGHT, and FULL OUTER JOINs, which the WHERE syntax does not.

I don't think you'll find the WHERE-type JOIN substantially less portable than the JOIN syntax.

score 3 · Answer 7 · answered Oct 23 '09 at 13:25

3

As long as you don't use the ANSI natural join feature I'm OK with it.

I found this quote by – ScottCher, I totally agree:

I find the WHERE syntax easier to read than INNER JOIN - I guess its like Vegemite. Most people in the world probably find it disgusting but kids brought up eating it love it.

answered Oct 23 '09 at 13:25

Rob van Laarhoven

8,737
2
31
49

1

+1 on the vegemite comment :-) Although I don't agree with your technical point: what's your problem with natural joins? – Arthur Reutenauer Oct 23 '09 at 13:32
3

You should be specific about the columns you join. Natural joins are error prone. For example (from ask tom site) Consider a model like this: create table emp ( empno, deptno, ename ); create table dept ( deptno, dname, empno /* person responsible for this dept */ ); now, generate a list of enames and the dname they work in using a natural join. – Rob van Laarhoven Oct 23 '09 at 13:50
OK, my bad. I naively imagined that natural joins were based on foreign keys, and just added those joins, not all the ones you could guess by column names. The behaviour you describe is indeed bad. – Arthur Reutenauer Oct 23 '09 at 15:23
Even if it uses FK's it does not know which to choose. I have a fk from emp.deptno to dept.deptno and one from dept.empno to emp.empno. Think about what happens when you add or rename columns. Yuck – Rob van Laarhoven Oct 23 '09 at 16:03

ercan · Answer 8 · 2009-10-23T14:40:57.347

It really depends on habits, but I have always found Oracle's comma separated syntax more natural. The first reason is that I think using (INNER) JOIN diminishes readability. The second is about flexibility. In the end, a join is a cartesian product by definition. You do not necessarily have to restrict the results based on IDs of both tables. Although very seldom, one might well need cartesian product of two tables. Restricting them based on IDs is just a very reasonable practice, but NOT A RULE. However, if you use JOIN keyword in e.g. SQL Server, it won't let you omit the ON keyword. Suppose you want to create a combination list. You have to do like this:

SELECT *
FROM numbers
JOIN letters
ON 1=1

Apart from that, I find the (+) syntax of Oracle also very reasonable. It is a nice way to say, "Add this record to the resultset too, even if it is null." It is way better than the RIGHT/LEFT JOIN syntax, because in fact there is no left or right! When you want to join 10 tables with several different types of outer joins, it gets confusing which table is on the "left hand side" and which one on the right.

By the way, as a more general comment, I don't think SQL portability exists in the practical world any more. The standard SQL is so poor and the expressiveness of diverse DBMS specific syntax are so often demanded, I don't think 100% portable SQL code is an achievable goal. The most obvious evidence of my observation is the good old row number problemmatic. Just search any forum for "sql row number", including SO, and you will see hundreds of posts asking how it can be achieved in a specific DBMS. Similar and related to that, so is limiting the number of returned rows, for example..

SQL Server lets you do CROSS JOIN statements. These do not require a following ON clause. — Philip Kelley, Oct 23 '09 at 14:24
-1: so many errors. Joins increase readability by separating join conditions from resultset predicates. Joins allow for all varieties, including Cross Joins where no On clause is required. Oracle outer join syntax DOES have a left/right, the (+) syntax requires a += or a =+, which is equally difficult to interpret as to "which side" the nulls can be on. The Left Right is not difficult at all, the Left is the stuff you already have the Right is the new table you are Join-(add)ing to the resultset, just as you read, from Left to Right. And the old += syntax is just plain wrong in some cases. — Charles Bretana, Nov 23 '09 at 14:52

score 0 · Answer 9 · answered Oct 23 '09 at 13:13

This is Transact SQL syntax, and I'm not quite sure how "unportable" it is - it is the main syntax used in Sybase, for example (Sybase supports ANSI syntax as well) as well as many other databases (if not all).

The main benefits to ANSI syntax is that it allows you to write some fairly tricky chained joins that T-SQL prohibits

score 0 · Answer 10 · answered Feb 28 '22 at 23:44

Speaking as someone who writes automated sql query transformers (inline view expansions, grafted joins, union factoring) and thinks of SQL as a data structure to manipulate: the non-JOIN syntax is far less pain to manipulate. I can't speak to "harder to read" complaints; JOIN looks like an lunge toward relational algebra operators. Don't go there :-)

score -2 · Answer 11 · answered Oct 23 '09 at 13:12

-2

Actually, this syntax is more portable than a JOIN, because it will work with pretty much any database, whereas not everybody supports the JOIN syntax (Oracle Lite doesn't, for example [unless this has changed recently]).

answered Oct 23 '09 at 13:12

MusiGenesis

74,184
40
190
334

3

The older join syntax is wrong in numerous cases where *outer* join logic is required. – Charles Bretana Oct 23 '09 at 13:27

Why use a JOIN clause versus a WHERE condition?

11 Answers11

Linked

Related