0

I'm using Spring with PostgreSQL and I try to do a sort of UPSERT by using a code like this:

jt.update("delete from A where id = 1")
jt.update("insert into A (id, value) values (1, 100)")

wrapped inside a transaction (using @Transactional).

The problem is that, when there are many concurrent requests, this code fails with 'duplicate key' errors, meaning the transaction is not isolated, or...

Am I missing something about how transactions work? Should I use a different mechanism here (ex. thread synchronization)?

Mari9
  • 104
  • 2
  • 8

2 Answers2

11

I wrote a rather large blogpost about it, so even though I might get downvotes for links, read this.

The gist is that transactions do not help in here (at least by default), and while it is possible to write correct upsert, it's actually pretty tricky.

  • Huh, I already sprinkle links to that post around rather liberally. It's a very useful resource, thanks. – Craig Ringer Jun 18 '13 at 07:26
  • @CraigRinger: I once (couple of months ago) provided answer by linking to my blogpost, which was apparently "not ok". Don't remember details now, so can't provide link. –  Jun 18 '13 at 13:23
  • @CraigRinger: Actually, depesz is right about `barely more than a link to an external site`. It's in the [list of criteria why answers may be deleted](http://stackoverflow.com/help/deleted-answers). Now, *we* know that depesz is an expert in the field, and that the link is good. The rules apply for everybody, though. – Erwin Brandstetter Jun 21 '13 at 17:09
  • 1
    @ErwinBrandstetter Good point; I tend to elaborate considerably on the answer, not *just* link, and I do think that policy is valuable. Not least because external links move/vanish. What I tend to do in this case is have one full-size answer with some links to external articles for supporting detail, then link to that answer with a very short explanation when similar issues come up later. I find that works well. – Craig Ringer Jun 22 '13 at 01:45
5

Assuming this simple table:

CREATE TABLE tbl(id int primary key, value int);

This function almost 100% secure (see comments) for concurrent transactions.:

CREATE OR REPLACE FUNCTION f_upsert(_id int, _value int)
  RETURNS void AS
$func$
BEGIN
LOOP
   UPDATE tbl SET value = _value WHERE  id = _id;

   EXIT WHEN FOUND;

   BEGIN
      INSERT INTO tbl (id, value)
      VALUES (_id, _value);

      RETURN;

   EXCEPTION WHEN UNIQUE_VIOLATION THEN     -- tbl.id has UNIQUE constraint.
      RAISE NOTICE 'It actually happened!'; -- hardly ever happens
   END;

END LOOP;
END
$func$ LANGUAGE plpgsql;

Call:

SELECT f_upsert(2, 2);

It's very similar to this INSERT / SELECT case with more explanation and links:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    I'm not convinced it's "100% safe", but it doesn't suffer from the usual obvious faults like lost updates, racing on an `EXISTS` test, etc. `UPDATE` will block on an uncommitted `INSERT`s then the re-check will cause it to return zero rows, so you handle the rollback-after-insert problem. I think it's safe if the function is the *only* thing you do in the transaction. You can still lose updates if you try to do more than one upsert per transaction. It'll also only work correctly in `READ COMMITTED` mode. – Craig Ringer Jun 24 '13 at 03:27
  • 2
    It's the same approach as from http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING btw – Craig Ringer Jun 24 '13 at 03:36
  • @CraigRinger: Very good link. The [`UPSERT` example](http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE) uses almost an identical code path, so the notes fit perfectly. – Erwin Brandstetter Jun 24 '13 at 11:02