73

The effect of issuing an inner join is the same as stating a cross join with the join condition in the WHERE-clause. I noticed that many people in my company use cross joins, where I would use inner joins. I didn't notice any significant performance gain after changing some of these queries and was wondering if it was just a coincidence or if the DBMS optimizes such issues transparently (MySql in our case). And here a concrete example for discussion:

SELECT User.*
FROM User, Address
WHERE User.addressId = Address.id;

SELECT User.*
FROM User
INNER JOIN Address ON (User.addressId = Address.id);
Igor Suhotin
  • 41
  • 1
  • 2
  • 5
soulmerge
  • 73,842
  • 19
  • 118
  • 155
  • 7
    The first one isn't ANSI SQL. I know that Microsoft has recommended moving to the syntax of your second example. MySQL may do the same now that they are corporate. Also, the first one isn't a cross join. A cross join would be if you left out the where clause. – WakeUpScreaming Mar 22 '09 at 17:17
  • 6
    The one you are calling cross join is an older syntax. We dinosaurs :) tend to use it. The Inner join syntax was agreed a as part of SQL 92 (IIRC) standard. Since then vendors have moved towards support of it with most vendors supporting both. Therefore its not ANSI because its not in the docs. – Karl Mar 22 '09 at 20:42
  • 2
    @WakeUpScreaming +1 for pointing out that there are no cross joins shown here. – Leigh Riffel Nov 23 '10 at 16:44

11 Answers11

67

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

If a WHERE clause is added to a cross join, it behaves as an inner join as the WHERE imposes a limiting factor.

As long as your queries abide by common sense and vendor specific performance guidelines (i), I like to think of the decision on which type of join to use to be a simple matter of taste.

(i) Vendor Specific Performance Guidelines

  1. MySQL Performance Tuning and Optimization Resources
  2. PostgreSQL Performance Optimization
Joel
  • 184
  • 1
  • 10
karim79
  • 339,989
  • 67
  • 413
  • 406
28

There is no difference other than the inner join is a lot clearer because it defines the join, leaving the where clause to be the actual limiting condition.

Otávio Décio
  • 73,752
  • 17
  • 161
  • 228
19

Use EXPLAIN to view the query plan for both queries, and see if there's any difference. Quite possibly MySQL will use the same execution plan in both cases. I use the INNER JOIN syntax mainly because it's a lot clearer.

Emil H
  • 39,840
  • 10
  • 78
  • 97
12

I find that work-places that allow the first syntax (comma separated tables) tend to have significant time taken up debugging cases where more rows are returned than intended. Unintentional cross joins are the bane of a system, and can bring even the most well-tuned database to it's knees. It has brought our pre-prod system to a screeching halt on at least two occasions in the last year.

The second syntax (join syntax) forces the writer to think about how the tables are joined together first, and then only return the interesting rows. It is impossible to accidentally do a cross join using this syntax, and thus the danger of accidental poorly performing queries is reduced.

However, that issue aside, I have never noticed any speed difference between the two syntaxes in any systems I have had.

Tom H
  • 46,766
  • 14
  • 87
  • 128
Jonathan
  • 25,873
  • 13
  • 66
  • 85
3

Explaining both queries gives same output

mysql> explain select * from t T1, t T2 where T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from t T1  join t T2 on T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

But using inner join syntax is preferable as its more clearer and more precise. Mysql may internally tune Left and Right join queries to select less data as compared to Cross Join.

minhas23
  • 9,291
  • 3
  • 58
  • 40
2

SQL Server said "When a WHERE turns a Cross Join into an Inner Join", so there are not difference. http://msdn.microsoft.com/en-us/library/ms190690.aspx

I did SQL server "Execution plan" the Performance is same.

nisiumi
  • 344
  • 3
  • 7
2

The first example is functionally the same as the second example. However, this syntax should be avoided for several reasons. First it is much easier to accidentally get a cross join when using this syntax especially when there are mulitple joins in the table. If you see a lot of this type of query with the keyword distinct, you probably have someone who is trying to fix the cross joins.

Next, the left and right join syntax using the older style is deprecated and will no longer be supported. Further, it doesn't work correctly now anyway. Sometimes it misinterprets the outer join and sends back the wrong results set. So any queries you have using = or = in the where clause should immediately be replaced.

Third, ANSI standard joins are easier to understand and maintain. An understanding of joins is one of the most critical basic skills that anyone querying any relational database needs to have. It has been my experience that some people who use the older style don't really understand joins and how they work and thus write queries that do not actually do what they intended.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
1

One additional benefit of the first syntax is you can be more general in your limiting condition. Not just equality.

But if you are using equality, why trust the optimizer? Make sure it won't first generate the cross join and then eliminate rows. Use the second one.

Ozgur Ozturk
  • 1,265
  • 11
  • 9
1

The order in which you join tables or you put your ON / WHERE conditions should not matter.

The query optimizer should optimize and use the best order anyway (and chosing how to best filter the data, where to start, etc)

As many others though, I suggest using the INNER JOIN syntax, since it makes things much more readable, it is more transparent with the syntax of LEFT or FULL joins as well.

There's a somewhat more dwelling text about it here: http://linus.brimstedt.se/?/article/articleview/SQL Syntax

/B

Brimstedt
  • 3,020
  • 22
  • 32
-1

Since the beginning of time optimizers have being built around classic restrict-project-cartesian product syntax. Virtually all the vendors copied the design pioneered by System R. Then, grudgingly, vendors adopted "the latest-and-greatest" ANSI syntax and retrofitted their SQL execution engines. Contrary to what marketing brochure can tell you ("use the latest syntax"), not much on physical implementation level has been changed: it is still [indexed] nested loops, or hash or sort-merge join. Therefore, there is no basis to assume superiority of one syntax over the other.

To my personal taste, the new syntax is redundant, noisy, and inconsistent. As to being sanctioned by the committee, "walk into any park in every city and you'll find no statue of committee".

Tegiri Nenashi
  • 3,066
  • 2
  • 19
  • 20
  • 2
    The use of implicit joins is a SQL antipattern. It is subject to accidental cross joins, it is harder to maintain especially if you need to change to outer joins at some point (you should not mix the two and the outer join implicit syntax at least in some datbases does not work correctly anyway). If you actually want a cross join there is no way to tell if that is true or if you suffer from the accidental cross join problem. So intent is not clear. This is just a poor practice. – HLGEM May 27 '14 at 17:06
-1

There is not much difference between cross join with where clause and inner join they are probably the same but with the use of inner join we would decrease some part in code.

Nihar
  • 1
  • 2