What isolation level should I use for workers pulling from a queue of jobs to run?

Question

I have a PostgreSQL database (v9.5.3) that's hosting "jobs" for workers to pull, run, and commit back.

When a worker wants a job, it runs something to the effect of:

SELECT MIN(sim_id) FROM job WHERE job_status = 0;
-- status 0 indicates it's ready to be run

job is a table with this schema:

CREATE TABLE commit_schema.job (
  sim_id serial NOT NULL,
  controller_id smallint NOT NULL,
  controller_parameters smallint NOT NULL,
  model_id smallint NOT NULL,
  model_parameters smallint NOT NULL,
  client_id smallint,
  job_status smallint DEFAULT 0,
  initial_glucose_id smallint NOT NULL
);

Afterwards, it uses this sim_id to piece together a bunch of parameters in a JOIN:

SELECT a.par1, b.par2 FROM
a INNER JOIN b ON a.sim_id = b.sim_id;

These parameters are then return to the worker, along with the sim_id, and the job is run. The sim_id is locked by setting job.job_status to 1, using an UPDATE:

UPDATE job SET job_status = 1 WHERE sim_id = $1;

The results are then committed using that same sim_id.

Ideally,

Workers wouldn't under any circumstances be able to get the same sim_id.
Two workers requesting a job won't error out, one will just have to wait to receive a job.

I think that using a serializable isolation level will ensure that the MIN() always returns unique sim_id's, but I believe this may also be achievable using a read committed isolation level. Then again, MIN() may not be able to concurrently and deterministically give unique sim_id's to two concurrent workers?

score 3 · Accepted Answer · edited Jun 20 '20 at 09:12

3

This should work just fine for concurrent access using the default isolation level Read Committed and FOR UPDATE SKIP LOCKED (new in pg 9.5):

UPDATE commit_schema.job j
SET    job_status = 1
FROM  (
   SELECT sim_id
   FROM   commit_schema.job
   WHERE  job_status = 0
   ORDER  BY sim_id
   LIMIT  1
   FOR    UPDATE SKIP LOCKED
   ) sub
WHERE  j.sim_id = sub.sim_id
RETURNING sim_id;

job_status should probably be defined NOT NULL.

Be wary of certain corner cases - detailed explanation in this related answer on dba.SE:

Postgres UPDATE … LIMIT 1

To address your comment

There are various ways to return from a function:

Make it a simple SQL function instead of PL/pgSQL.
Or use RETURN QUERY with PL/pgSQL.
Or assign the result to a variable with RETURNING ... INTO - which can be an OUT parameter, so it will be returned at the end of the function automatically. Or any other variable and return it explicitly.

Related (with code examples):

edited Jun 20 '20 at 09:12

Community

1
1

answered Aug 02 '16 at 00:21

Erwin Brandstetter

605,456
145
1,078
1,228

Quick question, does this "Returning" let me place this in a function? If my other functions are in plpgsql and expecting a table from the function that performs this, can this just be added as another column via the OUT clause in the declaration? Would it be something like `RETURNING sub.sim_id INTO function_sim_id`? Also in the event that there are no more jobs, what would happen? – ijustlovemath Aug 03 '16 at 19:31
@ijustlovemath: Yes. Consider the addendum. – Erwin Brandstetter Aug 04 '16 at 00:58
One final question, if there are no more rows with job_status_id = 0, can I simply do an "IF sim_id IS NULL THEN foo()" after the UPDATE to handle that gracefully? Or is there a better way to handle it? – ijustlovemath Aug 09 '16 at 13:05
1

@ijustlovemath: I suggest `IF NOT FOUND THEN foo(); END IF;` in plpgsql code. Also consider details in the last chapter of [the linked answer](http://dba.stackexchange.com/questions/69471/postgres-update-limit-1/69497#69497). – Erwin Brandstetter Aug 09 '16 at 14:51

What isolation level should I use for workers pulling from a queue of jobs to run?

1 Answers1

To address your comment