0

According to psycopg2: insert multiple rows with one query, it is much more efficient to use psycopg2's execute instead of executemany . Can others confirm?

The above StackOverflow question suggests using mogrify for creating statements of the sort:

INSERT INTO table VALUES (value1, value2), (value3, value4)

Is it possible to generate such a statement using the regular execute function? I thought something of the form

cursor.execute("""INSERT INTO table VALUES (%s, %s), (%s, %s)""", ((value1,value2),(value3,value4)))

would work.

UPDATE:

For instance, I tried I passing into execute the sql statement:

insert into history (timestamp) values (%s),(%s); 

with the folowing tuple:

(('2014-04-27 14:07:30.000000',), ('2014-04-27 14:07:35.000000',))

but all I got back was the error:

no results to fetch

Community
  • 1
  • 1
ChaimKut
  • 2,759
  • 3
  • 38
  • 64
  • You didn't try this first? – Martijn Pieters Apr 27 '14 at 08:44
  • Multi-valued inserts are indeed more efficient than `executemany` in `psycopg2`, though only up to a point - if you're inserting tens of thousands of rows per statement, memory use becomes an issue. Batches of a few hundred or couple of thousand (if the rows are smaller) can be fine. – Craig Ringer Apr 27 '14 at 09:26

1 Answers1

8

To use the execute method place the data to be inserted in a list. A list will be adapted by psycopg2 to an array. Then you unnest the array and cast the values as necessary

import psycopg2

insert = """
    insert into history ("timestamp")
    select value
    from unnest(%s) s(value timestamp)
    returning *
;"""

data = [('2014-04-27 14:07:30.000000',), ('2014-04-27 14:07:35.000000',)]
conn = psycopg2.connect("host=localhost4 port=5432 dbname=cpn")
cursor = conn.cursor()
cursor.execute(insert, (data,))
print cursor.fetchall()
conn.commit()
conn.close()

Not sure if the performance difference from executemany will be significant. But I think the above is neater. The returning clause will, as the name suggests, return the inserted tuples.

BTW timestamp is a reserved word and should not be used as a column name.

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • +1 for 'unnest' and pointing out the 'timestamp' problem, a remnant of using an ORM. – ChaimKut Apr 27 '14 at 15:34
  • (In postgres it looks like 'timestamp' is not reserved. http://www.postgresql.org/docs/9.3/static/sql-keywords-appendix.html ) – ChaimKut Apr 27 '14 at 15:39
  • You don't need the column names in the select. This will work just as well: `INSERT INTO my_table (col1, col2) SELECT * FROM unnest(%s) s(col1 integer, col2 text) RETURNING *;` – pir Sep 20 '19 at 17:55