5

Yesterday I found a slow query running on the server(this query costs more than 1 minute).It looks like this:

select a.* from a 
left join b on a.hotel_id=b.hotel_id and a.hotel_type=b.hotel_type
where b.hotel_id is null

There are 40000+ rows in table a and 10000+ rows in table b.An unique key had already been created on columns hotel_id and hotel_type in table b like UNIQUE KEY idx_hotel_id (hotel_id,hotel_type).So I used the explain keyword to check the query plan on this sql and I got a result like the following:

                type            key                             rows
1   SIMPLE  a   ALL     NULL    NULL            NULL    NULL    36804   
1   SIMPLE  b   index   NULL    idx_hotel_id    185     NULL    8353    Using where; Using index; Not exists

According to the reference manual of MySQL, when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE NOT NULL index the join type will be "eq_ref".See the second row of the query plan,the value of column type is "index".But I really had en unique index on hotel_id and hotel_type and both the two columns were used by the join.The join type "ef_ref" is more efficient than the join type "ref" and "ref" is more efficient than "range"."index" is the last join type wo wanna hava except "ALL".This is what I'm confused about and I wanna know why the join type here is "index". I hope I describe my question clear and I'm looking forward to get answers from you guys,thanks!

Damodaran
  • 10,882
  • 10
  • 60
  • 81
luxury_zh
  • 51
  • 4
  • 2
    change your select a.* to select a.hotel_id, a.hotel_type and so on. select * not only return the data but more than that. – errorare Nov 14 '13 at 07:22
  • @errorare What gets returned shouldn't affect how the join is done. – Barmar Nov 14 '13 at 07:26
  • Refer here [link](http://stackoverflow.com/a/3639964/1572987) – errorare Nov 14 '13 at 07:30
  • Why are you looking for b.hotel_id = NULL? You are going to get a join of all rows with NULL hotel_id values, if any exist. Also, if the only index you have on b.hotel_id is a UNIQUE index, that index is useless for NULL values. – elixenide Nov 14 '13 at 07:32
  • 1
    "An unique key had already been created on columns hotel_id and hotel_type in table b ". What about table a? It should have index as well – Maxim Krizhanovsky Nov 14 '13 at 07:32
  • @EdCottrell I use the condition b.hotel_id is NULL because I wanna find the records in table a but not exist in table b. – luxury_zh Nov 14 '13 at 11:58
  • For Me, Removing Select * Actually worked (to my complete surprise) – Adam Fowler Feb 26 '16 at 00:05

2 Answers2

1

Where Is Null checks can be slow, so maybe it is that.

select * from a 
where not exists ( select 1 from b where a.hotel_id=b.hotel_id and a.hotel_type=b.hotel_type )

Also: how many records are you returning? If you are returning all 36804 records this could slow things down as well.

Sam
  • 1,358
  • 15
  • 24
  • about 2000+ rows should be returned. – luxury_zh Nov 14 '13 at 12:01
  • 1
    @luxury_zh It could be that that's what slowing it down, especially if 'a' contains many columns. You could try specifiying the columns you need (see errorare's first comment on your question) and see if that helps. You you could try limiting the amount of records that are returned as well. – Sam Nov 14 '13 at 12:53
0

Thanks all the people above!I found the way to solve my problem myself.The columns hotel_id and hotel_type didn't have the same character set.After I made them both "utf8",my query returned result in about less than 10 millisecond.There is an good article about left join and index in MySQL,I strongly recommend it to you guys.Here is the site:http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/

luxury_zh
  • 51
  • 4