3

I needed UPSERT functionality in Postgres. Since Postgres does not support this natively, I wrote a function which does that (tries to update, if no row was updated then inserts)

This is the template for the function: https://stackoverflow.com/a/1109198/681671

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;


SELECT merge_db(1, 'david');

I am using Spring JDBC template. This select statement (in its parameterized form) and an array of objects is what I pass to the the batchUpdate method of JDBCTemplate.

I get this exception:

 org.postgresql.util.PSQLException: A result was returned when none was expected.

I suspect it is because of use of SELECT.

I know I can use Callable in a loop but that would make the app very chatty and I/O latency will make it very slow.

How is batch upsert done in Postgres using Spring JDBC / raw JDBC?

I am using Postgresql 9.1.

Community
  • 1
  • 1
Dojo
  • 5,374
  • 4
  • 49
  • 79
  • 1
    See http://stackoverflow.com/search?q=%5Bpostgresql%5D%20upsert – Craig Ringer Jun 22 '13 at 01:57
  • @CraigRinger: Thanks will try that. But still looking for a confirmation that multiple parameterized calls to a procedure cannot be batched when using JDBC + Postgres. I saw addBatch method of Statement interface and it does something similar to what I want but does it not accept parameters. One has to serialize the parameters in application code and provide a SQL string. There are many parameters and escaping each may be prone to error. – Dojo Jun 22 '13 at 05:57
  • Could you show us how are you calling the function from Java? – MatheusOl Jun 22 '13 at 16:43

2 Answers2

7

Batch upsert is done in PostgreSQL by:

  • Beginning a transaction
  • Creating a TEMPORARY table
  • Populating it using JDBC batched INSERT or (preferably) by using the COPY API provided by the PgJDBC driver
  • LOCKing the real destination table IN EXCLUSIVE MODE, which allows only SELECTs from other transactions to continue.
  • Doing an UPDATE ... FROM to update existing rows
  • Doing an INSERT INTO ... SELECT ... WHERE NOT EXISTS (SELECT 1 FROM real_table WHERE ...) to add rows that aren't already in real_table
  • COMMITing the transaction

If multiple transactions attempt to do this they'll land up getting serialised on the table lock. An upsert isn't ever going to be a concurrency-friendly operation.

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
1

Don't know much about postgres in particular, but my usual method is as follows :

  • do the insert
  • if it fails with unique_violation, do an update

I do it this way so that it minimizes locks on the table, and guards against race conditions.

Of course, it'll only work if the table you are inserting into has a unique constraint on columns other than the primary key.

DaveH
  • 7,187
  • 5
  • 32
  • 53