2

This MySQL update statement takes around 7 seconds and I'm wondering how to improve its performance. The overall query was very slow so I began breaking it down into smaller pieces (100 ids at a time):

UPDATE results
INNER JOIN urls
  ON urls.id = results.url_id
SET results.url = urls.name
WHERE results.url_id >= 100
  AND results.url_id < 200

Basically I'm building a de-normalized table of "results" which initially has NULL for results.url. I need to fill that in from the urls.name field, and it's taking too long. This query modifies around 25,000 records.

When I profile the query, here's the results I see:

starting                    0.000052
checking permissions        0.000004
checking permissions        0.000005
Opening tables              0.000012
checking permissions        0.000003
checking permissions        0.000004
System lock                 0.000004
Table lock                  0.000007
init                        0.000013
updating main table         0.000005
optimizing                  0.000015
statistics                  0.000114
preparing                   0.000030
executing                   0.000004
Sending data                0.239372
converting HEAP to MyISAM   0.165893
Sending data                0.239267
updating reference tables   6.585605
end                         0.000029
end                         0.000008
removing tmp table          0.007922
end                         0.000007
query end                   0.000005
freeing items               0.000286
logging slow query          0.000005
cleaning up                 0.000004

Since the query is spending most of its time "updating" (over 6.5 seconds), is there anything I can do here? I'm not clear whether this means mysql is spending time locating the rows to update, or whether its spending time just copying bits from one table to another.

If it takes 6.5 seconds to copy over values from one table to another, there's nothing I can do (I assume, since it doesn't look like anything is going to disk?) But if its spending time locating the rows to update, I thought that an index might help. I added an index on results.url_id, but it doesn't seem to be doing the trick.

Rob Crowell
  • 1,447
  • 3
  • 15
  • 25

2 Answers2

0

Is that update running frequenly?

If so - try to update only the values that are realy different.

If not - 7s for lets say once a day is quite fast.

Stephan
  • 41,764
  • 65
  • 238
  • 329
Admin
  • 1
0

Apply an index on urls.id and results.url_id, then try again.

Bjoern
  • 15,934
  • 4
  • 43
  • 48
  • Thanks. I did add an index on results.url_id, and urls.id is the primary key of the table. The profile above was after the index was added. – Rob Crowell May 12 '11 at 20:46
  • Are urls.id and results.url_id of the same datatype? If it ain't, there is the possibility that the database server is occupied converting the data to match the index. – Bjoern May 12 '11 at 20:48
  • @Bjoern Yep. results.url_id is int(11) NOT NULL and urls.id is int(11) NOT NULL AUTO_INCREMENT. I guess part of my question is what exactly the 'updating reference tables' means -- is it looking for rows, physically updating the rows, or both? – Rob Crowell May 12 '11 at 20:51
  • From http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html: "The server is executing the second part of a multiple-table update and updating the matched rows from the other tables." – Bjoern May 12 '11 at 20:52
  • Cool, yep. I was confused though because that same document also has this to say about the 'Updating' state: "The thread is searching for rows to update and is updating them." – Rob Crowell May 12 '11 at 21:09
  • I guess there isn't much tweaking query-wise. That said, 25000 records in about 7 secs with an inner join ain't too bad id your database machine ain't a high-end system! – Bjoern May 12 '11 at 21:18