How to insert into two tables in Postgres using CTE in one SQL query

Question

I would like to add data to two tables in Postgres with one query using CTE. After a user submit data from a form in the frontend for table2, I want my SQL query to insert id, value1 in table1. Then the same id in table one will be used to create data of table2. When I tried the hard-coded value below in pgAdmin, the code works as the id was generated for table1 and used to create table2 data.

WITH ins AS (
    INSERT INTO table1 
     (post_type, created_on) 
    VALUES 
     ('keyword', 'NOW()')
    RETURNING pid)
    INSERT INTO table2 
    (pid, author, title, description, body, category, search_volume, is_deleted, created_on)
    VALUES
    ((SELECT pid FROM ins), 'jet12', 'Head', 'Head is on top', 'Head is the most important part of the body', 'Head', '10000', 'false', 'NOW()')

However, since I'll be using a form to populate the data, hard-coding will surely not work. I tried the code below but I can't seem to get around it.

WITH ins AS (
    INSERT INTO table1 
     (post_type, created_on) 
    VALUES 
     ('keyword', 'NOW()')
    RETURNING pid)
    INSERT INTO table2 
    (pid, author, title, description, body, category, search_volume, is_deleted, created_on)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)

Please how can I write the query to accept parameters as values? Is there another way to go about it? See my full function below:

const keywordHandler = (req, res) => {
  const values = [req.body.pid, req.body.username, req.body.title, req.body.description, req.body.body, req.body.category, req.body.search_volume, req.body.is_deleted, req.body.created_on]
pool.query(`WITH ins AS (
    INSERT INTO table1 
     (post_type, created_on) 
    VALUES 
     ('keyword', 'NOW()')
    RETURNING pid)
    INSERT INTO table2 
    (pid, author, title, description, body, category, search_volume, is_deleted, created_on)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)`, 
               values, (k_err, k_res) => {
                if (k_err) {
                    return console.error('Error executing query', k_err.stack)
                  }
           res.json({
                    status: 'Keyword submitted successfully',
                    data: k_res.rows
    });
     })
};

score 1 · Answer 1 · answered Dec 04 '19 at 23:36

1

The following should work.

WITH ins AS (
  INSERT INTO table1
    (post_type, created_on) 
  VALUES
    ('keyword', now())
  RETURNING pid
)
INSERT INTO table2 
  (pid, author, title, description, body, category, search_volume, is_deleted, created_on)
SELECT
  ins.pid, $1, $2, $3, $4, $5, $6, $7, $8
FROM ins;

An alternative—which I would heartily recommend if it's an option for you—is to use UUIDs for ids. Then you wouldn't need a CTE to pass values between the two statements at all; just generate the UUID at the app level and include it in both of your insert statements. Since the likelihood of generating a duplicate UUID is somewhere around "winning the lottery jackpot every day for a year" slim to none during your lifetime, it should be considered a safe bet with additional benefits.

answered Dec 04 '19 at 23:36

Miles Elam

1,440
11
19

A `uuid` is simpler - at a price. Occupies 16 bytes on disk where an `integer` from a plain serial only need 4 bytes. Harder for human handling, too. – Erwin Brandstetter Dec 04 '19 at 23:45
A 32-bit value is dangerous for overflow reasons. A 64-bit value should be considered a minimum except for lookup tables in 2019. But the plain serial also suffers from predictability (ability to guess next and previous IDs, which has proven to be a real data risk in the wild). Once indexed, the performance difference is negligible. The extra bytes on disk are typically a rounding error if that row also contains text of varying length. – Miles Elam Dec 04 '19 at 23:53
I agree on the point regarding human handling. It's a tradeoff, but one I strongly consider worth taking when the more human friendly solution has inherent security issues if exposed in an API or front end, which they typically are. – Miles Elam Dec 04 '19 at 23:53
@Miles Elam thanks for the suggestion. When I tried your code with hard-coded value in pgAdmin, it works. But when I used the parameters, I got the error ```there is no parameter $1``` – Jetro Olowole Dec 05 '19 at 00:35
@MilesElam thanks again for the suggestion. I'm using g uuid in my id. Can you give an example of how to generate the UUID at the app level and include it in both of my insert statements? – Jetro Olowole Dec 05 '19 at 00:48
@JetroOlowole Ah, I had assumed you were using a library that performed query parameterization. It sounds like this isn't the case. pgAdmin does not have built-in support for dynamic parameters to my knowledge. You would need to make a prepared statement first and then invoke it with the appropriate values. https://www.postgresql.org/docs/current/sql-prepare.html – Miles Elam Dec 05 '19 at 00:49
Every programming language has its own API for doing so. In Node it's the uuid module. Python, it's also a module called uuid. In Java it's java.util.UUID. The goal here is that you are generating the UUID outside the database ahead of time and then passing the new ID to both insert statements. – Miles Elam Dec 05 '19 at 00:56
Any new comment? – Jetro Olowole Dec 09 '19 at 04:51

Erwin Brandstetter · Answer 2 · 2019-12-05T01:31:55.633

Don't pass pid- it's generated by the first CTE, that's the whole point.

You could use a prepared statement (most programming languages have libraries wrapping this functionality) or create a server-side function to take parameters.

Demonstrating a simple SQL function returning void (nothing):

CREATE OR REPLACE FUNCTION f_my_insert(_username text
                                     , _title text
                                     , _description text
                                     , _body text
                                     , _category text
                                     , _search_volume text
                                     , _is_deleted bool       -- type?
                                     , _created_on timestamp) -- type?
  RETURNS void AS  --  or what do you want to return?
$func$
   WITH ins AS (
      INSERT INTO table1 
             (post_type, created_on) 
      VALUES ('keyword', now())  --  'keyword' hard-coded, now() hard-coded, too?
      RETURNING pid
      )
   INSERT INTO table2 
         (pid, author, title, description, body, category, search_volume, is_deleted, created_on)
   SELECT pid, $1    , $2   , $3         , $4  , $5      , $6           , $7        , $8           -- use ordinal references
-- SELECT pid,_author,_title,_description,_body,_category,_search_volume,_is_deleted,_created_on)  -- ... or parameter names
   FROM   ins
$func$  LANGUAGE sql;

Adapt data types in the function declaration to your actual needs.

Call:

SELECT f_my_insert('username1', 'title2', 'description3', 'body4', 'category5', 'search_volume6', false, now());

There are various ways to call a Postgres function:

Functions with variable number of input parameters

Using a SELECT for the 2nd INSERT instead of a VALUES expression only inserts if the first INSERT actually returned a row.

It's odd that you use a hard-coded 'NOW()' for the first created_on column and pass a parameter for second. Do the same for both instead?

'NOW()' is just a noisy, misleading way to spell the constant 'now', btw. I guess you want to actually use the function now() (transaction timestamp) instead, which can be subtly different in certain situations. See:

Difference between now() and current_timestamp

Might be. It's mentioned in the link. Typically, the transaction timestamp makes more sense, though. — Erwin Brandstetter, Dec 04 '19 at 23:41
Thanks @ErwinBrandstetter. Please note that the keyword value is hard coded. I have other tables that will have to also perform the same CTE to use the table1 id. I only need the $ parameters for table2. — Jetro Olowole, Dec 05 '19 at 00:51
@JetroOlowole: Adapted. And added a function as example how to actually pass parameters. — Erwin Brandstetter, Dec 05 '19 at 01:32
@ErwinBrandstetter please I have 3 questions. 1. I have a data schema to validate user input. Will it affect the function as you indicated above? 2. As you noted, I was thinking of using parameter $1 for table1 'now()' and passing same for table2. I was not sure how to go about it that's why I hard-coded it twice. Is it possible in your function code above? 3. Can I use the function inside my code? Or do I need to create a file for the function and import it into my express route code? — Jetro Olowole, Dec 05 '19 at 08:38
How would I use a prepared statement to insert into two tables, returning the ID, as in the OP's question? — Czechnology, Jan 19 '21 at 07:55
@Czechnology: Good question. Please make it an actual *question*, comments are not the place. You can always reference this question or answer for context. And you can drop a comment with a forward link here to get my attention. — Erwin Brandstetter, Jan 19 '21 at 14:33

score 0 · Answer 3 · answered Sep 28 '21 at 16:01

If you are doing this because you are using JPA and designed your entities to inherit a shared base class, then my suggestion would be to rethink your entity definitions.

Consider using Embedded classes rather than inheritance - embedding will give you the shared definition, without having every entity class sharing the same table - each resulting entity will be in its own table.

Being able to work with your table definitions easily in SQL should always be a design consideration.

How to insert into two tables in Postgres using CTE in one SQL query

3 Answers3