I have recently switched my project tables to InnoDB (thinking the relations would be a nice thing to have). I'm using a PHP script to index about 500 products at a time.
A table storing word/ids association:
CREATE TABLE `windex` (
`word` varchar(64) NOT NULL,
`wid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`count` int(11) unsigned NOT NULL DEFAULT '1',
PRIMARY KEY (`wid`),
UNIQUE KEY `word` (`word`)
) ENGINE=InnoDB AUTO_INCREMENT=324551 DEFAULT CHARSET=latin1
Another table stores product id/word id associations:
CREATE TABLE `indx_0` (
`wid` int(7) unsigned NOT NULL,
`pid` int(7) unsigned NOT NULL,
UNIQUE KEY `wid` (`wid`,`pid`),
KEY `pid` (`pid`),
CONSTRAINT `indx_0_ibfk_1` FOREIGN KEY (`wid`) REFERENCES `windex` (`wid`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `indx_0_ibfk_2` FOREIGN KEY (`pid`) REFERENCES `product` (`ID`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The script was tested using MyISAM and it indexes products relatively fast (much, much faster than InnoDB). First time running in InnoDB it was ridiculously slow but after nesting more values together I ended up speeding it up by a lot (but not enough).
I would assume innodb would be much faster for this type of thing because of rowlevel locks but that's not the case.
I construct a query that looks something like:
SELECT
title,keywords,upc,...
FROM product
WHERE indexed = 0
LIMIT 500
I create a loop and fill an array with all the words that need to be added to windex and all the word id/product id pairs that need to be added to indx_0.
Because innodb keeps increasing my auto-increment values whenever i do a "REPLACE INTO" or "INSERT IGNORE INTO" that fails because of duplicate values, I need to make sure the values I add don't already exist. To do that I first select all values that exist using a query like such:
SELECT wid,word
FROM windex
WHERE
word = "someword1" or word = "someword2" or word = "someword3" ... ...
Then I filter out my array against the results which exist so all the new words I add are 100% new.
This takes about 20% of overall execution time. The other 80% goes into adding the pair values into indx_0, for which there are many more values.
Here's an example of what I get.
0.4806 seconds to select products. (0.4807 sec total).
0.0319 seconds to gather 500 items. (0.5126 sec total).
5.2396 seconds to select windex values for comparison. (5.7836 sec total).
1.8986 seconds to update count. (7.6822 sec total).
0.0641 seconds to add 832 windex records. (7.7464 sec total).
17.2725 seconds to add index of 3435 pid/wid pairs. (25.7752 sec total).
Operation took 26.07 seconds to index 500 products.
The 3435 pairs are being all executed in a single query such as:
INSERT INTO indx_0(pid,wid)
VALUES (1,4),(3,9),(9,2)... ... ...
Why is InnoDB so much slower than MyISAM in my case?