0

I have the following query

INSERT INTO address (house_number, street, city_id)
    values(11, 'test st', (select id from city where LOWER(city) = LOWER('somecity')))

Is there anyway to insert "somecity" in the city table if "somecity" does not exist in city then after inserting, it would return the ID for the inserted row?

I did find this answer that says upsert can be used to achieve this

https://stackoverflow.com/a/31742830/492015

but I can't find an example that inserts if select does not return the row.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
Arya
  • 8,473
  • 27
  • 105
  • 175

1 Answers1

1

Instead of nesting the INSERTs, you could use a CTE to perform the INSERTs one after the other but as a single statement:

WITH tmp AS (
    INSERT INTO test_city (city) VALUES ('somecity')
    ON CONFLICT (lower(city)) DO UPDATE SET city = excluded.city
    RETURNING id, city
)
INSERT INTO test_address (house_number, street, city_id)
SELECT house_number, street, id
FROM (VALUES (11, 'test st', 'somecity')) val (house_number, street, city)
LEFT JOIN tmp USING (city)
RETURNING *

Using this setup:

DROP TABLE IF EXISTS test_address;
DROP TABLE IF EXISTS test_city;
CREATE TABLE test_address (
    house_number int
    , street text
    , city_id int
    );
CREATE TABLE test_city (
    id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
    , city text 
    );
CREATE UNIQUE INDEX test_city_uniq_idx ON test_city USING btree (lower(city));
INSERT INTO test_city (city) VALUES ('Somecity');

and with the INSERT above, the query

SELECT * FROM test_address;

yields

| house_number | street  | city_id |
|--------------+---------+---------|
|           11 | test st |       1 |

and

SELECT * FROM test_city;

yields

| id | city     |
|----+----------|
|  1 | somecity |

Note that the CTE replaces

(select id from city where LOWER(city) = LOWER('somecity'))

with an INSERT .. ON CONFLICT .. DO UPDATE statement:

INSERT INTO test_city (city) VALUES ('somecity')
ON CONFLICT (lower(city)) DO UPDATE SET city = excluded.city
RETURNING id, city

I used DO UPDATE instead of DO NOTHING so that RETURNING id, city will always return something. If you use DO NOTHING, then nothing is returned when there is a conflict.

Note however that a consequence of using city = excluded.city is that the original 'Somecity' gets replaced by 'somecity'. I'm not sure you'll find that behavior acceptable, but unfortunately I haven't figured out how to do nothing when there is a conflict and yet return id and city at the same time.


Another issue you may have with the above solution is that I used a unique index on lower(city):

CREATE UNIQUE INDEX test_city_uniq_idx ON test_city USING btree (lower(city));

This allows you to use the identical condition in the INSERT statement:

INSERT ... ON CONFLICT (lower(city))

as a substitute for the condition LOWER(city) = LOWER('somecity') which appeared in your SELECT statement. It produces the desired effect, but the trade-off is that now you have a unique index on (lower(city)).


Regarding the followup question of how to insert into more than 2 tables:

You can chain together more than one CTE, and the subsequent CTEs can even reference the prior CTEs. For example,

CREATE UNIQUE INDEX city_uniq_idx ON city USING btree (lower(city));
CREATE UNIQUE INDEX state_uniq_idx ON state USING btree (lower(state_code));

WITH tmpcity AS 
(
   INSERT INTO
      city (city) 
   VALUES
      (
         'Miami'
      )
      ON CONFLICT (lower(city)) DO 
      UPDATE
      SET
         city = excluded.city RETURNING id, city
)
, tmpstate as 
(
   INSERT INTO
      state (state_code) 
   VALUES
      (
         'FL'
      )
      ON CONFLICT (lower(state_code)) DO 
      UPDATE
      SET
         state_code = excluded.state_code RETURNING id, state_code
)
INSERT INTO
   address (house_number, street, city_id, state_id) 
   SELECT
      house_number,
      street,
      tmpcity.id,
      tmpstate.id 
   FROM
      (
      VALUES
         (
            12,
            'fake st.',
            'Miami',
            'FL'
         )
      )
      val (house_number, street, city, state_code) 
      LEFT JOIN
         tmpcity USING (city) 
      LEFT JOIN
         tmpstate USING (state_code)
         ON CONFLICT (street) DO NOTHING
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • Thanks for your answer, I'm using your first example with CTE, it's working well, how would I add more tables like city to it? I also have a state table and a zipcode table. – Arya Apr 30 '19 at 18:00
  • You can [chain together more than one CTE](https://stackoverflow.com/a/21387005/190597), and the subsequent CTEs can even reference the prior CTEs. – unutbu Apr 30 '19 at 18:12
  • This is what I constructed, something tells me I messed it up. https://pastebin.com/jejQ51di – Arya Apr 30 '19 at 19:39
  • I think you are almost there, except the VALUES tuple should repeat `'Miami', 'FL'`, and the join condition should be `LEFT JOIN tmpstate USING (state_code)`. I've edited post above to show what I mean. – unutbu Apr 30 '19 at 20:05
  • Thanks, that works, but I ran into another complication when adding the CTE for the county table. Since there are cases where the same name for a county could belong to more than one state, I have to create an index on two columns for the county table. `CREATE UNIQUE INDEX county_uniq_idx ON county USING btree (lower(county), state_id);` But `ON CONFLICT` only supports one column. I tried using ON CONFLICT ON CONSTRAINT shown here https://pastebin.com/BTd4hSZT, but unfortunately that does not work. What do you think is the best solution? – Arya May 01 '19 at 21:16
  • Nevermind, it looks like ON CONFLICT does support two columns – Arya May 02 '19 at 01:02
  • Another question if you don't mind. When chaining CTE, how do I access the id from the last CTE? for example https://pastebin.com/jdihGePs I'm using tmpstate.id to get the id, is that the correct way of doing it? – Arya May 02 '19 at 04:21
  • Use the [`INSERT .. SELECT .. FROM table` pattern](https://dba.stackexchange.com/a/46477/2037), where `table` is a `LEFT JOIN` between `VALUES` and `tmpstate`. Above, the last SQL example uses this pattern to insert `tmpcity.id` and `tmpstate.id`. An `INSERT` inside a CTE has exactly the same syntax as an `INSERT` outside a CTE. – unutbu May 02 '19 at 10:56
  • What do you think if I use `citext` datatype instead of `text`? it's pretty much a case insensitive `text` datatype. – Arya Jul 03 '19 at 14:01