12

I'm following the SQLAlchemy documentation here to write a bulk upsert statement with Postgres. For demonstration purposes, I have a simple table MyTable:

class MyTable(base):
    __tablename__ = 'mytable'
    id = Column(types.Integer, primary_key=True)
    test_value = Column(types.Text)

Creating a generic insert statement is simple enough:

from sqlalchemy.dialects import postgresql

values = [{'id': 0, 'test_value': 'a'}, {'id': 1, 'test_value': 'b'}]
insert_stmt = postgresql.insert(MyTable.__table__).values(values)

The problem I run into is when I try to add the "on conflict" part of the upsert.

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

Trying to execute this statement yields a ProgrammingError:

from sqlalchemy import create_engine
engine = create_engine('postgres://localhost/db_name')

engine.execute(update_stmt)

>>> ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'

I think my misunderstanding is in constructing the statement with the on_conflict_do_update method. Does anyone know how to construct this statement? I have looked at other questions on StackOverflow (eg. here) but I can't seem to a way to address the above error.

rahul
  • 552
  • 6
  • 11
user144153
  • 829
  • 1
  • 12
  • 28

1 Answers1

32
update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

index_elements should either be a list of strings or a list of column objects. So either [MyTable.id] or ['id'] (This is correct)

set_ should be a dictionary with column names as keys and valid sql update objects as values. You can reference values from the insert block using the excluded attribute. So to get the result you are hoping for here you would want set_={'test_value': insert_stmt.excluded.test_value} (The error you made is that data= in the example isn't a magic argument... it was the name of the column on their example table)

So, the whole thing would be

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_={'test_value': insert_stmt.excluded.test_value}
)

Of course, in a real world example I usually want to change more then one column. In that case I would do something like...

update_columns = {col.name: col for col in insert_stmt.excluded if col.name not in ('id', 'datetime_created')}
update_statement = insert_stmt.on_conflict_do_update(index_elements=['id'], set_=update_columns)

(This example would overwrite every column except for the id and datetime_created columns)

Paul Becotte
  • 9,767
  • 3
  • 34
  • 42
  • 2
    Great answer - super clear and easy to follow. Thanks! – user144153 Mar 27 '19 at 06:04
  • 4
    This optimised my horrendous alpha code from 90 seconds to 9 seconds. Someone once told me "never execute SQL statements in a loop" and it stuck with me. I knew I had to find a better way, so - cheers! – deed02392 Sep 06 '19 at 12:16
  • I keep getting `AttributeError: 'Insert' object has no attribute 'excluded'` when i'm trying to compose the update_columns. My insert statement looks like this: `insert_statement = insert(table).values(items)` where table is of type DeclarativeMeta and items is a list of dicts. Any advice would be appreciated. – Mike Sep 09 '21 at 09:12
  • 5
    i figured it out... for anyone with this problem, pay big attention to where you import the insert from. it should be `from sqlalchemy.dialects.postgresql import insert` not `from sqlalchemy import insert`. god. – Mike Sep 09 '21 at 09:42
  • 1
    Yes, for database specific constructs like this, you have to import from the dialect directly. – Paul Becotte Sep 10 '21 at 11:45
  • clear and easy. tks – Yang Apr 29 '23 at 11:15