63

I have the following MySql query:

select t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1

It takes about 30 seconds to run, which was strange, because if I comment out the join or the where clause it takes less than a second: i.e.

select t1.*
from Table1 t1
where t1.FilterID = 1

or

select t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID

each take less than a second.

Then there is the STRAIGHT_JOIN keyword, which I can find one reference of, here: http://dev.mysql.com/doc/refman/5.0/en/join.html

STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.

What? I can write:

select t1.*
from Table1 t1
STRAIGHT_JOIN  Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1

and the query executes in less than a second.

Even stranger, I can write:

select STRAIGHT_JOIN  t1.*
from Table1 t1
inner join Table2 t2
on t1.CommonID = t2.CommonID
where t1.FilterID = 1

and it takes less than a second, and this syntax does not appear to even be legal.

I would guess the second example means that a STRAIGHT_JOIN will be used whenever an INNER JOIN is written, but I can’t find any documentation about it.

What is going on here, and how can the “join optimizer” result in such relatively poor performance? Should I always use STRAIGHT_JOIN? How can I tell when to use it or not?

Table1 and Table2 both have integer primary keys; FilterID is a foreign key to another table; the CommonID columns are both foreign keys to a third table. They both have indexes on them. The database engine is InnoDB.

Thanks

zod
  • 2,688
  • 5
  • 27
  • 49

1 Answers1

48

What is going on here, and how can the “join optimizer” result in such relatively poor performance?

STRAIGHT_JOIN forces the join order of the tables, so table1 is scanned in the outer loop and table2 in the inner loop.

The optimizer is not perfect (though stil quite decent), and the most probable cause is the outdated statistics.

Should I always use STRAIGHT_JOIN

No, only when the optimizer is wrong. This may be if your data distribution is severely skewed or cannot be calculated properly (say, for spatial or fulltext indexes).

How can I tell when to use it or not?

You should collect the statistics, build the plans for both ways and understand what do these plans mean.

If you see that:

  1. The automatically generated plan is not optimal and cannot be improved by the standard ways,

  2. The STRAIGHT_JOIN version is better, you understand it always will and understand why it always will

, then use STRAIGHT_JOIN.

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
  • 2
    Thanks. any idea what the syntax select STRAIGHT_JOIN t1.* ... does? – zod Apr 28 '11 at 13:21
  • 3
    @zod: you can write a single `STRAIGHT_JOIN` this way instead of individual `STRAIGHT_JOIN` between each pair of tables. It makes the tables to join in the order they are written in the query. – Quassnoi Apr 28 '11 at 13:28
  • @Quassnoi, do you mean that having `select STRAIGHT_JOIN` at the start of the query would make straight joins default for all the (possibly more than one) joins in the query? – Pacerier Nov 22 '13 at 22:05
  • @Quassnoi, Or does it only affect the first join in a statement like `select straight_join*from t1 join t2 join t3`? – Pacerier Mar 13 '15 at 11:34
  • 2
    @Pacerier: from the docs: *`STRAIGHT_JOIN` forces the optimizer to join the tables in the order in which they are listed in the `FROM` clause. You can use this to speed up a query if the optimizer joins the tables in nonoptimal order. `STRAIGHT_JOIN` also can be used in the `table_references` list.* http://dev.mysql.com/doc/refman/5.6/en/select.html – Quassnoi Mar 13 '15 at 12:31
  • @Quassnoi, No I mean when we have `t1 join t2 join t3 join t4`, does the straight join only affect `t1 join t2`, or does it also affect the other two joins `t2 join t3` and `t3 join t4`? – Pacerier Mar 13 '15 at 15:20
  • @Pacerier: if you specify `STRAIGHT_JOIN` after `SELECT`, all tables are joined in the order they are listed in the query. – Quassnoi Mar 13 '15 at 15:32
  • @Quassnoi, Is that from experience? Because I do not see anywhere in the docs stating such. – Pacerier Mar 13 '15 at 17:51
  • @Pacerier: I just cited an excerpt from the docs, doesn't it say that very thing? – Quassnoi Mar 13 '15 at 18:54
  • @Quassnoi, I don't think *"in the order in which they are listed in the `FROM` clause"* is indicative that the effect affects `t1 join t2 join t3 join t4` and not just the first encountered join `t1 join t2`. – Pacerier Mar 14 '15 at 08:51