36

Writable CTEs were considered a solution to UPSERT prior to 9.5 as described in Insert, on duplicate update in PostgreSQL?

It is possible to perform an UPSERT with the information whether it ended up as an UPDATE or an INSERT with the following Writable CTEs idiom:

WITH
    update_cte AS (
        UPDATE t SET v = $1 WHERE id = $2 RETURNING 'updated'::text status
    ),
    insert_cte AS (
        INSERT INTO t(id, v) SELECT $2, $1 WHERE NOT EXISTS
            (SELECT 1 FROM update_cte) RETURNING 'inserted'::text status
    )
 (SELECT status FROM update_cte) UNION (SELECT status FROM insert_cte)

This query will return either "updated" or "inserted", or may (rarely) fail with a constraint violation in as described in https://dba.stackexchange.com/questions/78510/why-is-cte-open-to-lost-updates

Can something similar be achieved using PostgreSQL 9.5+ new "UPSERT" syntax, benefiting from its optimization and avoiding the possible constraint violation?

Community
  • 1
  • 1
Paul Guyot
  • 6,257
  • 1
  • 20
  • 31

4 Answers4

36

I believe xmax::text::int > 0 would be the easiest trick:

so=# DROP TABLE IF EXISTS tab;
NOTICE:  table "tab" does not exist, skipping
DROP TABLE
so=# CREATE TABLE tab(id INT PRIMARY KEY, col text);
CREATE TABLE
so=# INSERT INTO tab(id, col) VALUES (1,'a'), (2, 'b');
INSERT 0 2
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
 id | col  |   case   | ctid
----+------+----------+-------
  3 | c    | inserted | (0,3)
  4 | d    | inserted | (0,4)
  1 | aaaa | updated  | (0,5)
(3 rows)

INSERT 0 3
so=# INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col
returning *,case when xmax::text::int > 0 then 'updated' else 'inserted' end,ctid;
 id | col  |  case   | ctid
----+------+---------+-------
  3 | c    | updated | (0,6)
  4 | d    | updated | (0,7)
  1 | aaaa | updated | (0,8)
(3 rows)

INSERT 0 3
Vao Tsun
  • 47,234
  • 13
  • 100
  • 132
10

Drawing from @lad2025's answer, the result can be achieved by abusing settings and customized options with related functions in WHERE clauses to get a required side-effect.

CREATE TABLE t(id INT PRIMARY KEY, v TEXT);

INSERT INTO t (id, v)
    SELECT $1, $2
    WHERE 'inserted' = set_config('upsert.action', 'inserted', true)
    ON CONFLICT (id) DO UPDATE
        SET v = EXCLUDED.v
        WHERE 'updated' = set_config('upsert.action', 'updated', true)
RETURNING current_setting('upsert.action') AS "upsert.action";

The third parameter of set_config is is_local: true means the setting will go away at the end of transaction. More precisely, current_setting('upsert.action') will return NULL (and not throw an error) until the end of the session.

Community
  • 1
  • 1
Paul Guyot
  • 6,257
  • 1
  • 20
  • 31
  • 9
    Oh my, that's ugly! It makes my vision blur. I definitely do not want to remember this, let alone apply it!~ – Patrick Jan 29 '16 at 17:24
  • 1
    you can do it with `xmax::text::int > 0` - without transaction level setting (which is indeed original hack) – Vao Tsun Oct 29 '17 at 15:19
  • @VaoTsun is right. See https://stackoverflow.com/a/38858662/454126 for an example of how to use `xmax` for this purpose. Not pretty, but definitely prettier than `set_config` / `current_setting`. – Julian Mehnle Mar 15 '18 at 20:44
7

In SQL Server MERGE statement has $action that returns string 'INSERT', 'UPDATE', or 'DELETE'.

For Postgresql I can't find function/variable that does similiar thing for RETURNING.

One way to workaround it is to add column is_updated to your table:

DROP TABLE IF EXISTS tab;

CREATE TABLE tab(id INT PRIMARY KEY, col VARCHAR(100),
                 is_updated BOOLEAN DEFAULT false);
INSERT INTO tab(id, col) VALUES (1,'a'), (2, 'b');


-- main query
INSERT INTO tab(id, col)
VALUES (3, 'c'), (4, 'd'), (1,'aaaa')
ON CONFLICT (id) DO UPDATE SET col = EXCLUDED.col, is_updated = true
RETURNING id,col,
          CASE WHEN is_updated THEN 'UPDATED' ELSE 'INSERTED' END AS action;

Rextester Demo

Output:

╔════╦══════╦══════════╗
║ id ║ col  ║  action  ║
╠════╬══════╬══════════╣
║  3 ║ c    ║ INSERTED ║
║  4 ║ d    ║ INSERTED ║
║  1 ║ aaaa ║ UPDATED  ║
╚════╩══════╩══════════╝
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • Where is VALUES (2, 'b') in the output? – kometen Jan 13 '16 at 10:49
  • 1
    @kometen There is no need for it. It exists before. Why do you want to return entire table? Only records that are inserted/updated – Lukasz Szozda Jan 13 '16 at 10:50
  • But (2, 'b') is also inserted. And is_updated defaults to false. So I figured it would show up as INSERTED. – kometen Jan 13 '16 at 10:59
  • 1
    @kometen The point is this: imagine you have table with 2 milions records. Now you execute the main query. The returning will give back only 3 records (2 inserted, 1 updated). The first insert is to prepopulate data (for demo). – Lukasz Szozda Jan 13 '16 at 11:01
  • 1
    OK, so it's is relative to that particular INSERT. Very useful, thank you. – kometen Jan 13 '16 at 11:09
3

(xmax::text::bigint > 0) or (NOT xmax = 0). Typecast to integer will break once transaction count will reach the integer overflow.

Misha
  • 81
  • 1
  • 3