2

I am working with a MySQL table and I need to increment a value in one column for each row, of which there are over 6.5m.

The col type is varchar and can contain an integer or a string (i.e. +1). The table type is MyISAM.

I have attempted this with PHP:

  $adjust_by = 1;
  foreach ($options as $option) {
      $original_turnaround = $option['turnaround'];
      $adjusted_turnaround = $option['turnaround'];

      if (preg_match('/\+/i', $original_turnaround)) {
        $tmp = intval($original_turnaround);
        $tmp += $adjust_by;
        $adjusted_turnaround = '+'.$tmp;
      } else {
        $adjusted_turnaround += $adjust_by;
      }

      if (!array_key_exists($option['optionid'], $adjusted)) {
        $adjusted[$option['optionid']] = array();
      }

      $adjusted[$option['optionid']][] = array(
        'original_turn' => $original_turnaround,
        'adjusted_turn' => $adjusted_turnaround
      );
  }//end fe options

  //update turnarounds:
  if (!empty($adjusted)) {
    foreach ($adjusted as $opt_id => $turnarounds) {
      foreach ($turnarounds as $turn) {
        $update = "UPDATE options SET turnaround = '".$turn['adjusted_turn']."' WHERE optionid = '".$opt_id."' and turnaround = '".$turn['original_turn']."'";
        run_query($update);
      }
    }
  }

For obvious reasons there are serious performance issues with this approach. Running this in my local dev environment leads to numerous errors and eventually the server crashing.

Another thing I need to consider is when this is run in a production environment. This is for an ecommerce store, and I cannot have a huge update like this lock the database or cause any other issues.

One possible solution I have found is this: Fastest way to update 120 Million records

But creating another table comes with it's own issues. The codebase is not in a good state, similar queries are run on this table in loads of places so I would have to modify a large number of queries and files to make this approach work.

What are my options (if there are any)?

MP_Webby
  • 916
  • 1
  • 11
  • 35
  • "Another thing I need to consider is when this is run in a production environment. This is for an ecommerce store, and I cannot have a huge update like this lock the database or cause any other issues." MyISAM engine will lock the table when updating or inserting... you should consider the convert the table to InnoDB engine.. this engine doesn't lock the table with updates or inserts. – Raymond Nijland Sep 06 '17 at 11:38
  • Is the table indexed properly? Maybe simple indexing will speed this up enough. – sagi Sep 06 '17 at 11:41
  • Do not run the intensive task with PHP. You may be stuck with maximum timeout issues and also PHP is not designed for CPU intensive task. You can rather use python for the task. – user254153 Sep 06 '17 at 11:43
  • @RaymondNijland Yes you are absolutely right, however, will changing the table type have any kind of impact on other queries where this table is joined with another? – MP_Webby Sep 06 '17 at 11:45
  • 1
    @user254153 did you just suggest switching to Python because you *think* it's "better " for CPU intensive tasks compared to PHP? Please, update your knowledge base, what you're suggesting is bad, incorrect and based on your personal preferences and not facts. Also, this is not CPU intensive task. – Mjh Sep 06 '17 at 12:00

3 Answers3

6

You can do this task with SQL.

  • With CAST you can convert a string into integer.
  • With IF and SUBSTR you can check if string contains +.
  • With CONCAT you will add (merge a two values into one string) + to your calculated result (if it will be necessary).

Just try this SQL:

"UPDATE `options` SET `turnaround` = CONCAT(IF(SUBSTR(`turnaround`, 1, 1) = '+', '+', ''), CAST(`turnaround` AS SIGNED) + " + $adjust_by + ") WHERE 1";
Neodan
  • 5,154
  • 2
  • 27
  • 38
  • Great, will do. Could you provide a little explanation of what this is doing exactly? My SQL knowledge is a little limited and I don't fully understand it. Thank you. – MP_Webby Sep 06 '17 at 11:53
  • @Neodan WHERE 1? you can leave out this because it's always true – Raymond Nijland Sep 06 '17 at 12:00
  • This is the correct answer that removes PHP or any other language from equation. It converts the value in textual column to an integer, increments it and saves it. It's basically a nice fix for terrible database schema. +1 from me! – Mjh Sep 06 '17 at 12:03
  • Thanks for the explanation @Neodan, I was about to ask about the `WHERE 1`. How does this factor into it? Is this referring to `IF(SUBSTR)...` is true? – MP_Webby Sep 06 '17 at 12:04
  • @RaymondNijland I know, it is just an old habit (earlier it was necessary for slightly better performance). – Neodan Sep 06 '17 at 12:07
  • 1
    @MP_Webby no, `WHERE 1` just means that MySQL must take all records from the table. It is old trick for sligthly better performance, because if your query does not have `WHERE` section, then MySQL adds it by self. – Neodan Sep 06 '17 at 12:13
  • Yes, speed of sql is more than php! – Amir Fo Sep 06 '17 at 14:18
1

can't you just say

UPDATE whatevertable SET whatever = whatever + 1?

Try it and see, I'm pretty sure it will work!

EDIT: You have strings OR integers? Your DB design is flawed, this probably won't work, but would have been the correct answer had your DB design been more strict.

delboy1978uk
  • 12,118
  • 2
  • 21
  • 39
  • Yes I know, this would have been the perfect solution if the DB and codebase were properly designed. Sadly this is what I have inherited. Major design issues all over the place = Major headache... – MP_Webby Sep 06 '17 at 11:46
  • "Inherited" means that you need to do come cleanup before you continue. It _should not_ mean that you are stuck with the crappy design. – Rick James Sep 06 '17 at 17:55
0

You probably don't have, but need, this 'composite' index (in either order):

INDEX(optionid, turnaround)

Please provide SHOW CREATE TABLE.

Another, slight, performance boost is to explicitly LOCK TABLE WRITE before that update loop. And UNLOCK afterwards. Caution: This only applies to MyISAM.

You would be much better off with InnoDB.

Rick James
  • 135,179
  • 13
  • 127
  • 222