I have a photo gallery on my site with 1M photos in it. There are 2 search tables associated with it. Table #1 contains a list of words used in the photos. Table #2 contains a list of what words match up with what photos. Table #2 is 7M rows. I am testing partitioning this 7M row table because I have another set of tables with 120,000,000 rows. Queries against the 120M row wordmatch table below, with or without a join again the wordlist table below, take multiple seconds to run.
I am trying to perform a join between these 2 tables and MySQL 5.6 EXPLAIN PARTITIONS shows it is using all the partitions. How can I redo this query to make this correctly use only a single partition?
The 2 tables:
CREATE TABLE wordlist (
word_text varchar(50) NOT NULL DEFAULT '',
word_id mediumint(8) unsigned NOT NULL AUTO_INCREMENT
PRIMARY KEY (word_text),
KEY word_id (word_id)
) ENGINE=InnoDB
CREATE TABLE wordmatch (
pic_id int(11) unsigned NOT NULL DEFAULT '0',
word_id mediumint(8) unsigned NOT NULL DEFAULT '0',
title_match tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (word_id,pic_id,title_match),
KEY pic_id (pic_id)
) ENGINE=InnoDB
/*!50100 PARTITION BY HASH (word_id)
PARTITIONS 11 */;
SQL query I am performing:
EXPLAIN PARTITIONS SELECT m.pic_id FROM wordlist w, wordmatch m WHERE w.word_text LIKE 'bacon' AND m.word_id = w.word_id
+----+-------------+-------+-----------------------------------+-------+-----------------+---------+---------+----------------------------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-----------------------------------+-------+-----------------+---------+---------+----------------------------+------+-------------+
| 1 | SIMPLE | w | NULL | range | PRIMARY,word_id | PRIMARY | 52 | NULL | 1 | Using where |
| 1 | SIMPLE | m | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10 | ref | PRIMARY | PRIMARY | 3 | w.word_id | 34 | Using index |
+----+-------------+-------+-----------------------------------+-------+-----------------+---------+---------+----------------------------+------+-------------+
The join produces a query that uses all partitions. If I retrieve the word_id # first and go straight against the wordmatch table, everything is ok:
EXPLAIN PARTITIONS SELECT m.pic_id FROM wordmatch m WHERE m.word_id = 219657;
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+-------------+
| 1 | SIMPLE | m | p9 | ref | PRIMARY | PRIMARY | 3 | const | 18220 | Using index |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+-------------+
How do I get this to work correctly? I prefer not to split this into multiple queries if possible. You may have noticed I am using LIKE above. People will often search on bacon% to get plurals of words, etc. Example:
SELECT m.pic_id FROM wordlist w, wordmatch m WHERE w.word_text LIKE 'bacon%' AND m.word_id = w.word_id
I realize this wildcard search may result in 2 or more partitions being selected. This is probably ok, although if there is a way to change the partitioning to prevent that, I welcome any tips.
Edit #1: Added details as my original question was confusing. I was testing my 7M row table first before doing my 120M row table.
Edit #2: Resolution to my overall issue: My performance issues seem to be resolved as I partitioned my 120M row table into 101 partitions per this post: MySQL performance: partitions I do not know if MySQL is going against all the partitions at runtime - Ollie Jones says it does not in the comments below and EXPLAIN PARTITIONS is incorrect - but it is fast now so I am happy.