Does a SELECT
query following an INSERT … ON CONFLICT DO NOTHING
statement always find a row, given the default transaction isolation (read committed)?
I want to INSERT
-or-SELECT
a row in one table, then reference that when inserting rows in a second table. Since RETURNING
doesn't work well with ON CONFLICT
, I have so far used a simple CTE that should always give me the identity column value even if the row already exists:
$id = query(
`WITH ins AS (
INSERT INTO object (scope, name)
VALUES ($1, $2)
ON CONFLICT (scope, name) DO NOTHING
RETURNING id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM object WHERE scope = $1 AND name = $2
LIMIT 1;`,
[$scope, $name]
)
query(
`INSERT INTO object_member (object_id, key, value)
SELECT $1, UNNEST($2::text[]), UNNEST($3::int[]);`
[$id, $keys, $values]
)
However, I learned that this CTE is not entirely safe under concurrent write load, where it can happen that both the upsert and the select come up empty when a different transaction does insert the same row.
In the answers there (and also here) it is suggested to use another query to do the SELECT
:
start a new command (in the same transaction), which then can see these conflicting rows from the previous query.
If I understand correctly, it would mean to do
$id = query(
`INSERT INTO object (scope, name)
VALUES ($1, $2)
ON CONFLICT (scope, name) DO NOTHING
RETURNING id;`,
[$scope, $name]
)
if not $id:
$id = query(
`SELECT id FROM object WHERE scope = $1 AND name = $2;`
[$scope, $name]
)
query(
`INSERT INTO object_member (object_id, key, value)
SELECT $1, UNNEST($2::text[]), UNNEST($3::int[]);`
[$id, $keys, $values]
)
or even shortened to
query(
`INSERT INTO object (scope, name)
VALUES ($1, $2)
ON CONFLICT (scope, name) DO NOTHING;`,
[$scope, $name]
)
query(
`INSERT INTO object_member (object_id, key, value)
SELECT (SELECT id FROM object WHERE scope = $1 AND name = $2), UNNEST($3::text[]), UNNEST($3::int[]);`
[$scope, $name, $keys, $values]
)
I believe this would be enough to prevent that particular race condition (dubbed "concurrency issue 1" in this answer) - but I'm not 100% certain not to have missed anything.
Also what about "concurrency issue 2"? If I understand correctly, this is about another transaction deleting or updating the existing row, inbetween the INSERT
and SELECT
statements - and it's more likely to happen when using multiple queries instead of the CTE approach. How exactly should I deal with that? I assume locking the SELECT
with FOR KEY SHARE
is necessary in the second code snippet - but would I also need that in the third snippet where the id
is used within the same query? If it helps to simplify the answer, let's assume an object
can only be inserted or deleted, but is never updated.