0

can you please advise why such a query would take so long (literally 20-30 minutes)? I seem to have proper indexes set up, don't I?

UPDATE  `temp_val_import_435` t1,
`attr_upc` t2 SET t1.`attr_id` = t2.`id` WHERE t1.`value` LIKE t2.`upc`


CREATE TABLE `attr_upc` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `upc` varchar(255) NOT NULL,
 `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 UNIQUE KEY `upc` (`upc`),
 KEY `last_update` (`last_update`)
) ENGINE=InnoDB AUTO_INCREMENT=102739 DEFAULT CHARSET=utf8


CREATE TABLE `temp_val_import_435` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `attr_id` int(11) DEFAULT NULL,
 `translation_id` int(11) DEFAULT NULL,
 `source_value` varchar(255) NOT NULL,
 `value` varchar(255) DEFAULT NULL,
 `count` int(11) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `core_value_id` (`core_value_id`),
 KEY `translation_id` (`translation_id`),
 KEY `source_value` (`source_value`),
 KEY `value` (`value`),
 KEY `count` (`count`)
) ENGINE=InnoDB AUTO_INCREMENT=32768 DEFAULT CHARSET=utf8

Ed Cottrell's solution worked for me. Using = instead of LIKE sped up a smaller test query on 1000 rows by a lot.

I measured 2 ways: 1 in phpMyAdmin, the other looking at the time for DOM load (which of course involves other processes).

DOM load went from 44 seconds to 1 second, a 98% increase.

But the difference in query execution time was much more dramatic, going from 43.4 seconds to 0.0052 seconds, a decrease of 99.988%. Pretty good. I will report back on results from huge datasets.

Buttle Butkus
  • 9,206
  • 13
  • 79
  • 120

2 Answers2

2

Use = instead of LIKE. = should be much faster than LIKE -- LIKE is only for matching patterns, as in '%something%', which matches anything with "something" anywhere in the text.

If you have this query:

SELECT * FROM myTable where myColumn LIKE 'blah'

MySQL can optimize this by pretending you typed myColumn = 'blah', because it sees that the pattern is fixed and has no wildcards. But what if you have this data in your upc column:

blah
foo
bar
%foo%
%bar
etc.

MySQL can't optimize your query in advance, because it's possible that the text it is trying to match is a pattern, like %foo%. So, it has to perform a full text search for LIKE matches on every single value of temp_val_import_435.value against every single value of attr_upc.upc. With a simple = and the indexes you have defined, this is unnecessary, and the query should be dramatically faster.

elixenide
  • 44,308
  • 16
  • 74
  • 100
  • Thanks for the interesting information. I would think the query optimizer could do a quick `SELECT COUNT(*) FROM table WHERE myColumn LIKE '\%'` (or like `_`) to find out if it could optimize the end query with an `=` sign. But I guess it doesn't. I'll give this a try soon. – Buttle Butkus Nov 11 '13 at 07:38
  • I tested it on a smaller data set with about 1000 rows. It took about 1 second with `=`, vs' about 44 seconds with `LIKE`. Thanks! – Buttle Butkus Nov 11 '13 at 23:50
0

In essence you are joining on a LIKE which is going to be problematic (would need EXPLAIN to see is MySQL if utilizing indexes at all). Try this:

UPDATE  `temp_val_import_435` t1
INNER JOIN `attr_upc` t2
  ON t1.`value` LIKE t2.`upc`
SET t1.`attr_id` = t2.`id` WHERE t1.`value` LIKE t2.`upc`
Mike Brant
  • 70,514
  • 10
  • 99
  • 103
  • Hi thanks for your suggestion. My version of MySQL does not support `EXPLAIN` on anything but `SELECT` statements, but I may upgrade soon. I do get the sense, though, that some sort of cartesian result is being processed. It should not take 17 minutes to update 30k records. – Buttle Butkus Nov 11 '13 at 07:41
  • I think that MySQL may perform the same kind of join internally with my syntax as you use explicitly in yours. Either way, the queries take about 44 seconds with `LIKE` comparisons on fields, and about 1 second with `=`. – Buttle Butkus Nov 11 '13 at 23:51