1

I am trying to select a number of items from table1 that are not in table2. table1 has approximately 3 million rows, and table2 has approximately 8,000. I already have an index on table1 for (expire,unreliable,block,deleted,flag,expirationDate). How can I optimize this:

SELECT table1.*
FROM table1
LEFT JOIN table2
ON table1.item_ID = table2.item_ID
WHERE table1.expire = '0'
AND table1.unreliable = '0'
AND table1.block = '0'
AND table1.deleted = '0'
AND table1.flag = '0'
AND table1.expirationDate >= CURDATE() 
AND table2.item_ID IS NULL
GROUP BY item_ID
LIMIT 5000
user2694306
  • 3,832
  • 10
  • 47
  • 95

1 Answers1

2

Add an additional index table2.item_id. This indexes is important to optimize the LEFT JOIN.

Imagine you have two different phonebooks. You are searching for all People starting with "T" (the WHERE part) in book A. This is quite fast because you have an index. But for every phone number you have to scan through the full book B trying to find the same phone number there. Using an index (a sorted list of all phone numbers in book B) this is much faster...

Daniel Alder
  • 5,031
  • 2
  • 45
  • 55
  • But shouldn't those indexes exists as item_ID is a primary key in both tables? – user2694306 Oct 09 '15 at 12:57
  • You didn't say `item_ID` is you primary key. I'm not sure if mysql automatically creates an index for primary keys. I believe to remember that that's not the case. You can simply check this using `SHOW INDEX`. If it doesn't show such an index, it doesn't exist. – Daniel Alder Oct 09 '15 at 13:00
  • 1
    Ahh, you were right, it wasn't the primary key in the second table. Now it seems to work well. Thanks! – user2694306 Oct 09 '15 at 13:02
  • 1
    This explains a lot. And I found http://stackoverflow.com/questions/1071180/is-the-primary-key-automatically-indexed-in-mysql which explains that PK always have an index – Daniel Alder Oct 09 '15 at 13:03
  • MySQL will normally only use one index on a table. So an index on table1.item_id will likely not be of benefit in this query (the index on the columns to check is likely more useful). – Kickstart Oct 09 '15 at 15:24
  • @Kickstart True. I'm used to program oracle databases, and there the optimizer might create a different and more efficient algorithm using a table1.item_id index in some cases. – Daniel Alder Oct 09 '15 at 16:02