0

I need to make this update query more efficient.

UPDATE #table_name# SET #column_name2# = 1 WHERE #column_name1# in (A list of data)

Right now it takes more than 2 minute to finish the job when my list of data is quite large. Here is the result of explain of this query:

+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+
| id | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows   | Extra                        |
+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+
|  1 | SIMPLE      | #table_name# | index | NULL          | PRIMARY | 38      | NULL | 763719 | Using where; Using temporary |
+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+

In class, I was told that an OK query should at least have a type of range and is better to reach ref. Right now mine is index, which is the second slowest I think. I'm wondering if there's a way to optimize that.

Here is the table format:

+--------------------+-------------+------+-----+-------------------+-------+
| Field              | Type        | Null | Key | Default           | Extra |
+--------------------+-------------+------+-----+-------------------+-------+
| #column_name1#     | varchar(12) | NO   | PRI |                   |       |
| #column_name2#     | tinyint(4)  | NO   |     | 0                 |       |
| #column_name3#     | tinyint(4)  | NO   |     | 0                 |       |
| ENTRY_TIME         | datetime    | NO   |     | CURRENT_TIMESTAMP |       |
+--------------------+-------------+------+-----+-------------------+-------+

My friend suggested me that using exists rather than in clause may help. However, it looks like I cannot use exists like exists (A list of data)

Bing Gan
  • 415
  • 3
  • 6
  • 12
  • `Exists` wants a select statement. It is a bit different to `in`. But can be faster than `in`. But there is a difference in handling `null` values. for more information : http://stackoverflow.com/questions/14190788/subqueries-with-exists-vs-in-mysql – Nebi Sep 20 '16 at 10:59
  • Are you updating 763719 records? that's going to be slow no matter what – e4c5 Sep 20 '16 at 11:04
  • You say "when my list of data is quite large." How large? – O. Jones Sep 20 '16 at 11:17
  • @e4c5 I'm not updating that many. ~500 records are updated – Bing Gan Sep 20 '16 at 16:11
  • @OllieJones It's around 500 elements – Bing Gan Sep 20 '16 at 16:12

1 Answers1

0

For this query:

UPDATE #table_name#
    SET #column_name2# = 1
    WHERE #column_name1# in (A list of data);

You want an index on #table_name#(#column_name1#).

Do note that the number of records being updated has a very big impact on performance. If the "list of data" is really a subquery, then other methods are likely to be more helpful for improving performance.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Do I need to make it an index on a primary key? I think it's an index already – Bing Gan Sep 20 '16 at 16:04
  • @BingGan . . . If it is already *the* primary key, then an additional index is not necessary. If it is the first key of a compound primary key, then an additional index is not necessary. If it is a subsequent key of a primary key, then you do want the index. – Gordon Linoff Sep 20 '16 at 23:47