64

I have a simple table in PostgreSQL that has three columns:

  • id serial primary key
  • key varchar
  • value varchar

I have already seen this question here on SO: Insert, on duplicate update in PostgreSQL? but I'm wondering just how to get the id if it exists, instead of updating. If the standard practice is to always either "insert" or "update if exists", why is that? Is the cost of doing a SELECT (LIMIT 1) greater than doing an UPDATE?

I have the following code

INSERT INTO tag
    ("key", "value")
    SELECT 'key1', 'value1'
WHERE
    NOT EXISTS (
        SELECT id,"key","value" FROM tag WHERE key = 'key1' AND value = 'value1'
    );

which works in the sense that it doesn't insert if exists, but I'd like to get the id. Is there a "RETURNING id" clause or something similar that I could tap in there?

Community
  • 1
  • 1
aright
  • 2,014
  • 2
  • 17
  • 16
  • 3
    possible duplicate of [Return id if a row exists, INSERT otherwise](http://stackoverflow.com/questions/10057668/return-id-if-a-row-exists-insert-otherwise) – Erwin Brandstetter May 07 '14 at 20:50
  • 2
    Both the question and the answers here are much better voted than the ones there and the number of visits here is much higher so if some question must be closed it is the other not this one. – Clodoaldo Neto May 09 '14 at 23:55
  • 1
    **FYI** Postgres 9.5 brings an [UPSERT](https://en.wikipedia.org/wiki/Merge_(SQL)) feature. See [blog post by Craig Kerstiens](http://www.craigkerstiens.com/2015/05/08/upsert-lands-in-postgres-9.5/). – Basil Bourque Jul 17 '15 at 02:41
  • Possible duplicate of [Postgres: INSERT if does not exist already](https://stackoverflow.com/questions/4069718/postgres-insert-if-does-not-exist-already) – Gajus Oct 22 '18 at 15:47

3 Answers3

95

Yes there is returning

INSERT INTO tag ("key", "value")
SELECT 'key1', 'value1'
WHERE NOT EXISTS (
    SELECT id, "key", "value"
    FROM node_tag
    WHERE key = 'key1' AND value = 'value1'
    )
returning id, "key", "value"

To return the row if it already exists

with s as (
    select id, "key", "value"
    from tag
    where key = 'key1' and value = 'value1'
), i as (
    insert into tag ("key", "value")
    select 'key1', 'value1'
    where not exists (select 1 from s)
    returning id, "key", "value"
)
select id, "key", "value"
from i
union all
select id, "key", "value"
from s

If the row does not exist it will return the inserted one else the existing one.

BTW, if the pair "key"/"value" makes it unique then it is the primary key, and there is no need for an id column. Unless one or both of the "key"/"value" pair can be null.

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • 5
    I realized at closer inspection that this solution does not actually do what I was hoping since it does not return anything unless the INSERT works (i.e. if the key-value pair was not previously in the table). If the row was already in the table, it just returns blank (not the ID as intended). The reason I missed that earlier was that I accidentally made a typo when I was testing it (so that key-value pair wasn't in the table, and therefore it returned the ID of the row). – aright Aug 12 '13 at 22:20
  • 1
    BTW: could be `UNION ALL` , IMHO. – wildplasser Aug 13 '13 at 00:43
  • 1
    @wildplasser Notice that there will be always only one row returned regardless of the use of `all`. – Clodoaldo Neto Aug 13 '13 at 01:38
  • Semantically you're right. But I seriously doubt if the optimiser will realise it. (it is futile, of course) – wildplasser Aug 13 '13 at 01:40
  • @wildplasser Do you mean there could be a relevant difference in cost doing a distinct between a single row set and an empty set versus a union all? My wild guess is that you are conceptually correct so I'm changing it. – Clodoaldo Neto Aug 13 '13 at 01:50
  • 1
    For a single row it will be neglectible. But it could influence the plan generation/selection (which would only be detectible if more than one rowe were involved) Removing the duplicates is not that costly _ansich_ but it could prossibly force the two subplans to yield their `RETURNING` in a particular order (which is not needed) – wildplasser Aug 13 '13 at 01:59
  • @clodoaldo: Thanks for the answer, the revised version worked fine! The thing about not needing an id column is that I have a "connection table" that links a "tag" to an entity. I basically have three tables: entity, tag and entity_tag, and in the entity_tag table I have two columns: entity_id and tag_id. The end result that there is a many-to-many relationship between tags and entities, therefore the "connection table" entity_tag and therefore I need the id, do you agree with my reasoning would you say I'm breaking some best-practice rule? – aright Aug 14 '13 at 12:10
  • @ClodoaldoNeto: What is `node_tag` in your answer referring to? – stackoverflowuser2010 May 07 '15 at 05:26
  • @ClodoaldoNeto: And why do you put double quotes around the column names? – stackoverflowuser2010 May 07 '15 at 05:30
  • @stackoverflowuser2010 I did not bother to check if they were [reserved key words](http://www.postgresql.org/docs/current/static/sql-keywords-appendix.html#KEYWORDS-TABLE). Indeed they are not in Postgresql so it is safe to remove the double quotes. – Clodoaldo Neto May 07 '15 at 11:44
  • @stackoverflowuser2010 about `node_tag`, if you check the [question revisions](http://stackoverflow.com/posts/18192840/revisions) you will see that was the table name in the first question version. – Clodoaldo Neto May 07 '15 at 11:53
  • This is cool, but is it possible to make it work with executeBatch() or any similar approach for bulk insert? executeBatch() can't use RETURNING clause as I know. Thank you. – sermilion Jul 29 '15 at 07:36
  • Works well! But in the end you don't know if your query INSERTed something or not. The status message of the query is always `SELECT 1` since it's always the last two selects that are executed last. Is there a way to know if something was inserted or not? – jimis Sep 16 '22 at 15:44
6
with vals as (
  select 'key5' as key, 'value2' as value
)
insert into Test1 (key, value)
select v.key, v.value
from vals as v
where not exists (select * from Test1 as t where t.key = v.key and t.value = v.value)
returning id

sql fiddle demo

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
-1

And you can store value returned to variables in form of ... RETURNING field1, field2,... INTO var1, var2,...

RETURNING will normally return a query which would return Error 'query has no destination for result data' if you call it in plpgsql without using its returned result set.

Wutikrai
  • 39
  • 3