Our application has a table called cargo_item. It can be seen as a kind of queue: rows are inserted and later picked up for processing. Initially there was a single job that took 3000 entries and processed them one after another. Later on, somebody started three more instances of the same job. What happened is quite obvious: many of the items were processed twice.
My job is to make these processes work correctly when several instances run at the same time. The solution I'm going for now is to mark 3000 entries in the database with a job_id, and later fetch exactly those entries and process them in isolation from the other processes.
My current approach for flagging these rows is the following:
UPDATE cargo_item item
SET    job_id = 'SOME_UUID', job_ts = now()
FROM  (
    SELECT id
    FROM   cargo_item
    WHERE  state = 'NEW' AND job_id IS NULL
    LIMIT  3000
    FOR UPDATE
    ) sub
WHERE  item.id = sub.id;
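For completeness, the fetch step on the worker side would then be straightforward; a minimal sketch (assuming each job knows the UUID it wrote, here again the placeholder 'SOME_UUID'):
-- Each job reads back exactly the rows it flagged and processes them.
SELECT *
FROM   cargo_item
WHERE  job_id = 'SOME_UUID'
ORDER  BY id;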
Basically this approach locks the 3000 rows for the update. I'm not sure it's a good approach, though.
In another thread I read about using advisory locks for this scenario.
What do you think of the current approach, and of using advisory locks instead?
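To make the comparison concrete, here is roughly what I imagine the advisory-lock variant would look like. This is only a sketch, assuming id fits into bigint; pg_try_advisory_xact_lock() returns false for ids another session has already locked, so concurrent jobs would skip each other's rows instead of blocking on them:
UPDATE cargo_item item
SET    job_id = 'SOME_UUID', job_ts = now()
FROM  (
    SELECT id
    FROM   cargo_item
    WHERE  state = 'NEW' AND job_id IS NULL
    AND    pg_try_advisory_xact_lock(id) -- skip ids locked by other jobs
    LIMIT  3000
    ) sub
WHERE  item.id = sub.id;
One caveat I'm aware of: the planner may evaluate the function for more rows than the LIMIT eventually returns, so some ids could end up advisory-locked without being claimed; since transaction-scoped advisory locks are released at commit, that seems harmless here.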
UPDATE
As suggested, I'd adapt the UPDATE statement to lock the rows in a consistent order, which should avoid deadlocks between concurrent jobs:
UPDATE cargo_item item
SET    job_id = 'SOME_UUID', job_ts = now()
FROM  (
    SELECT id
    FROM   cargo_item
    WHERE  state = 'NEW' AND job_id IS NULL
    ORDER  BY id
    LIMIT  3000
    FOR UPDATE
    ) sub
WHERE  item.id = sub.id;
Thanks Erwin and Tometzky for the hint. Nevertheless, I wonder whether the way I'm trying to solve the problem is a good one. Are there different approaches you would consider?