
I have gone through almost all of the existing questions similar to this one, but did not find an answer. Sorry if I missed a previously posted question that answers this.

I have a MySQL table which I am using as a job queue. There are multiple workers which read jobs from this table.

The challenge is how to achieve this with MySQL queries on the table.

I have to select rows and update their job status in one atomic step, so that no worker picks up a job that is already being processed.

I want to run the following as a single atomic operation (pseudocode):

select name, job_type from jobs where job_status = "created" limit 10;

foreach row {
    update jobs set job_status = "processing" where id = '$id';
}

Is this possible in MySQL using queries/stored procedures/cursor?

  • It won't be a nice idea to use MySQL for this task; since you must perform locking operations to take care of race condition (due to multiple workers) and sometimes fall into deadlock. – akm Sep 21 '15 at 04:19

2 Answers


I believe that this is a solution for your case:

update jobs set job_status = "processing"
where id = '$id'
and job_status = "created"
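As a sketch of why the `job_status` check matters: two workers racing on the same id cannot both succeed, because the loser's UPDATE matches zero rows, which you can detect from the affected-row count. This uses Python's sqlite3 as a stand-in for MySQL (the table layout follows the question; the `try_claim` helper name is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, "
             "job_type TEXT, job_status TEXT)")
conn.execute("INSERT INTO jobs VALUES (1, 'job-a', 'email', 'created')")

def try_claim(conn, job_id):
    # The WHERE clause only matches rows still in "created", so the
    # claim is a compare-and-set: the losing worker's UPDATE modifies
    # zero rows, and rowcount tells us which worker won.
    cur = conn.execute(
        "UPDATE jobs SET job_status = 'processing' "
        "WHERE id = ? AND job_status = 'created'", (job_id,))
    return cur.rowcount == 1

print(try_claim(conn, 1))  # first claim succeeds -> True
print(try_claim(conn, 1))  # second claim fails -> False
```

The same rowcount check works through most MySQL client libraries, which report the number of affected rows after an UPDATE.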
Rodrigo Gomes
  • Thank you very much for the reply!! The query should return the rows after updating the job status for all the selected rows. Your query just updates already known row by id. – Sachin Patil Sep 21 '15 at 03:33

These techniques rely on transactions, so be sure autocommit is off. You shouldn't be using autocommit anyway for performance and data integrity reasons.

This also relies on using the InnoDB storage engine. MyISAM does not support row-level locking.

Use SELECT ... FOR UPDATE to put an exclusive lock on the returned rows. The lock will remain until the transaction is committed or rolled back.

select id, name, job_type
from jobs
where job_status = "created"
limit 10
for update;

foreach row {
    update jobs set job_status = "processing" where id = '$id';
    ...process...
    delete from jobs where id = '$id';
}

commit
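The flow above, sketched in Python. sqlite3 is used here only as a stand-in: it has no SELECT ... FOR UPDATE, so a BEGIN IMMEDIATE write lock plays the role of the row locks; against MySQL/InnoDB you would keep the FOR UPDATE clause and the same select/update/delete/commit shape:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions manually
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, "
             "job_type TEXT, job_status TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, 'created')",
                 [(i, f"job-{i}", "email") for i in range(1, 4)])

conn.execute("BEGIN IMMEDIATE")  # stand-in for the row locks FOR UPDATE would take
rows = conn.execute("SELECT id, name, job_type FROM jobs "
                    "WHERE job_status = 'created' LIMIT 10").fetchall()
for job_id, name, job_type in rows:
    conn.execute("UPDATE jobs SET job_status = 'processing' WHERE id = ?",
                 (job_id,))
    # ...process the job here...
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
conn.execute("COMMIT")

remaining = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
print(remaining)  # every claimed job was processed and deleted -> 0
```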

Unless you have a good reason to grab several jobs at once, you're better off grabbing one row at a time. It's a simple, fast query, and there's no reason to hold onto 10. In fact, if you hold onto 10 you can't commit after each successful job.

select id, name, job_type
from jobs
where job_status = "created"
limit 1
for update;

update jobs set job_status = "processing" where id = '$id';
...process the job...
delete from jobs where id = '$id';

commit

We can cut this down even further. There's no need to set the job to processing; other transactions won't see the change anyway. We can rely on the exclusive lock.

select id, name, job_type
from jobs
where job_status = "created"
limit 1
for update;

...process $id...

delete from jobs where id = '$id';

commit

This is robust: if your worker crashes, its lock is released and another worker can try the job again.


Alternatively, you can update one row at a time. This requires getting the ID of the row you just updated.

SET @update_id := 0;

UPDATE jobs
SET job_status = "processing", id = (SELECT @update_id := id)
WHERE job_status = "created"
LIMIT 1;

SELECT @update_id;

...do work on @update_id...

DELETE FROM jobs WHERE id = @update_id

COMMIT

This relies on UPDATE setting an exclusive lock on each updated row; other transactions will not be able to update that row. This is why it's a good idea to work on one row at a time.
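A small demonstration of that exclusive lock: a second writer cannot touch the row until the first transaction finishes. sqlite3 stands in for MySQL here, and the behavior differs in one respect: InnoDB blocks on the row lock until it times out, while sqlite (which locks the whole database for writes) raises OperationalError after its busy timeout instead:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "queue.db")
w1 = sqlite3.connect(path, isolation_level=None)
w2 = sqlite3.connect(path, isolation_level=None, timeout=0.1)
w1.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_status TEXT)")
w1.execute("INSERT INTO jobs VALUES (1, 'created')")

w1.execute("BEGIN")
w1.execute("UPDATE jobs SET job_status = 'processing' WHERE id = 1")

# Worker 2 tries to claim the same job while worker 1's
# transaction is still open, and is locked out.
try:
    w2.execute("UPDATE jobs SET job_status = 'processing' WHERE id = 1")
    blocked = False
except sqlite3.OperationalError:  # "database is locked"
    blocked = True
print(blocked)

w1.execute("COMMIT")
# After the commit the lock is gone and worker 2 can proceed.
w2.execute("UPDATE jobs SET job_status = 'processing' WHERE id = 1")
```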


Alternatively, add a queued job status along with a column recording which process owns the job.

UPDATE jobs
SET job_status = "queued", job_owner = "$pid"
WHERE job_status = "created" limit 10;

COMMIT;

SELECT name, job_type
FROM jobs
WHERE job_status = "queued"
  AND job_owner = "$pid"

foreach row {
    UPDATE jobs SET job_status = "processing" WHERE id = '$id';
    ... process ...
    DELETE FROM jobs WHERE id = '$id';
    COMMIT;
}

The downside of this approach is that if a worker dies, it will still own jobs, and nothing will ever pick them up. I would only recommend it if you have a long-lived master which controls the queue and assigns jobs to workers.
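The claim-a-batch-by-owner pattern sketched in Python. sqlite3 is again a stand-in for MySQL; its UPDATE has no LIMIT clause, so this sketch claims rows through an id subselect instead of MySQL's UPDATE ... LIMIT 10:

```python
import os
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, "
             "job_type TEXT, job_status TEXT, job_owner TEXT)")
conn.executemany(
    "INSERT INTO jobs (name, job_type, job_status) "
    "VALUES (?, 'email', 'created')",
    [(f"job-{i}",) for i in range(25)])

# Claim a batch of 10 jobs for this worker by stamping them with its pid.
pid = str(os.getpid())
conn.execute(
    "UPDATE jobs SET job_status = 'queued', job_owner = ? "
    "WHERE id IN (SELECT id FROM jobs WHERE job_status = 'created' LIMIT 10)",
    (pid,))
conn.commit()

# Only rows stamped with our pid are ours; other workers claimed none of them.
mine = conn.execute(
    "SELECT id, name, job_type FROM jobs "
    "WHERE job_status = 'queued' AND job_owner = ?", (pid,)).fetchall()
print(len(mine))

for job_id, name, job_type in mine:
    # ...process the job...
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    conn.commit()  # commit after each job so finished work survives a crash
```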


Finally, there are subtle problems with using MySQL as a queue. Consider getting a real queuing service.

Schwern
  • Thank you very much for the reply.. certainly it looks doable now. Does this select ... for update locks the rows for read also? I don't have good experience on cursors, what do you mean by throwing cursor. For second approach(processing single job at a time) we need to run three separate queries right? – Sachin Patil Sep 21 '15 at 03:58
  • @SachinPatil I was mistaken, `SELECT FOR UPDATE` exclusively locks for the duration of the transaction. I've extensively updated the answer. – Schwern Sep 21 '15 at 04:42
  • Thank you very much, you have elaborated in greater details. You have opened multiple ways to achieve this, I am more inclined to process single job at a time without locks as you mentioned in second approach. – Sachin Patil Sep 21 '15 at 06:29