0

I'm trying to lock a row in a table as being "in use" so that I don't process the data twice when my cron runs every minute. Because of the length of time it takes for my script to run, the cron will cause multiple instances of the script to run at once (usually around 5 or 6 at a time). For some reason, my "in use" method is not always working.

I do not want to LOCK the tables because I need them available for simultaneous processing, that is why I went the route of pseudo-locking individual rows with an 'inuse' field. I don't know of a better way to do this.

Here is an illustration of my dilemma:

 <?
    //get the first row from table_1 that is not in use
    $result = mysqli_query($connect,"SELECT * FROM `table_1` WHERE inuse='no'");
    $rows = mysqli_fetch_array($result, MYSQLI_ASSOC);
    $data1 = $rows[field1];

    //"lock" our row by setting inuse='yes'
    mysqli_query($connect,"UPDATE `table_1` SET inuse='yes' WHERE field1 = '$data1'");

    //insert new row into table_2 with our data if it doesn't already exist
    $result2 = mysqli_query($connect,"SELECT * FROM `table_2` WHERE field='$data2'");
    $numrows = mysqli_num_rows($result2);
    if($numrows >= 1) { 
      //do nothing
    } else {
      //run some unrelated script to get data
      $data2 = unrelatedFunction();

      //insert our data into table_2
      mysqli_query($connect,"INSERT INTO `table_2` (field) value ('$data2')");
    }

    //"unlock" our row in table_1
    mysqli_query($connect,"UPDATE `table_1` SET inuse='no' WHERE field1 = '$data1'");
 ?>

You'll see here that $data2 won't be collected and inserted if a row already exists with $data2, but that part is for error-checking and does not answer my question as the error still occurs. I'm trying to understand why (if I don't have that error-check in there) my 'inuse' method is sometimes being ignored and I'm getting duplicate rows in table_2 with $data2 in them.

zoltar
  • 706
  • 2
  • 7
  • 20
  • 1
    Rewrite the query: `SELECT * FROM table_1 WHERE inuse='no'` into `SELECT * FROM table_1 WHERE inuse='no' FOR UPDATE`, this will lock records when SELECT retrieves row and prevent other session from getting the same result. – krokodilko Sep 08 '13 at 22:13
  • 1
    @kordirko: Note that this will require a database engine that supports locking (e.g, InnoDB). But you shouldn't be using MyISAM anyway! –  Sep 08 '13 at 22:13
  • @duskwuff, thank you for the remark, you are right, SELECT FOR UPDATE will work only on InnoDb, and only on isolation levels different than `read uncommitted` – krokodilko Sep 08 '13 at 22:21
  • Where do you get `$data2` from initially? Is it also a field from the first query? – pilcrow Sep 09 '13 at 15:43
  • 1
    Relevant: http://stackoverflow.com/questions/4109414/ignoring-locked-row-in-a-mysql-query – pilcrow Sep 09 '13 at 16:13

1 Answers1

1

There's a lot of time in between your first select and the first update where another process can do the same operation. You're not using transaction either, so you're not guaranteeing any order of the changes becoming visible to others.

You can either move everything into a transaction with the isolation level you need and use SELECT .... FOR UPDATE syntax. Or you can try doing the copy in a different way. For example update N rows that you want to process and SET in_use=your_current_pid WHERE in_use IS NULL. Then you can read back the rows you manually marked for processing. After you finish, reset in_use to NULL.

viraptor
  • 33,322
  • 10
  • 107
  • 191