3

Trying the following UPSERT in PG (nodejs):

WITH upsert_result AS(
    INSERT INTO table1 (col1, col2)
    VALUES($1,$2)
    ON CONFLICT DO NOTHING
    RETURNING *
)
SELECT * FROM upsert_result
UNION
SELECT * FROM table1 WHERE col1=$1

Why do I sometimes get empty results?

It is my understanding that the UNION here was supposed to prevent this by fetching the conflicting row.

Here is the table definition:

CREATE EXTENSION IF NOT EXISTS "ltree";

CREATE TABLE IF NOT EXISTS table1(
    col1 TEXT NOT NULL UNIQUE REFERENCES table2 (col1) ON DELETE CASCADE,
    col2 ltree NOT NULL
);
CREATE INDEX IF NOT EXISTS tree_col2_idx ON table1 USING gist (col2);

I expected the query to return either the newly inserted row (if a row with same col1 doesn't already exist) or the existing row (if a row with the same col1 already exists). Indeed, this is what happens most of the time, but on some rare occasions, I get an empty result from the insert above.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Marumba
  • 359
  • 3
  • 12
  • 1
    if there is a conflict `on conflict do nothing` nothing will happen and thus no rows are returned afaik – Mr.P Dec 19 '22 at 13:46
  • 1
    to be sure of the expected behaviour, please provide some sample data and the exact expected results. – Jim Jones Dec 19 '22 at 14:20
  • 1
    Add the table definition of `table1` to the question, including the constraint `ON CONFLICT` is referencing. – Adrian Klaver Dec 19 '22 at 15:46

1 Answers1

2

There is a hidden race condition.

The check for conflicts includes (has to include!) rows from (yet) uncommitted concurrent transactions. If one has written to a row which your transaction now tries to UPSERT, your transaction has to wait for the other one to finish.

If the other transaction ends normally, your INSERT will detect a conflict and DO NOTHING, hence also not return the row. The SELECT sees the same snapshot from the start of the query and also cannot return the yet invisible row.

Read a detailed assessment here, especially chapter "Concurrency issue 1":

... and how to overcome it.

Aside: UNION ALL, not UNION. But that's also in the linked answer.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Does the UNION part occur after the other transaction (the concurrent insert) finishes? If so, why does it not see the updated snapshot? – Marumba Dec 25 '22 at 07:36
  • @Marumba With defailt "READ COMMITTED" isolation level, like its name inicates, only what has been committed at the start of each command is visible. All CTEs together with the outer `SELECT` form a *single* command and cannot see row versions committed later. So while the `UNION` part chronologically happens after the other transaction has committed, what's written by that other transaction remains invisible until the next command is started. Follow the provided link for more details. – Erwin Brandstetter Dec 25 '22 at 09:05