25

I want to do

 " on conflict (time) do update set name , description "

but I have no idea when I use stdin with csv , I don't know what name equal what? and description equal what...

table_a:

enter image description here

xxx.csv:

enter image description here

with open('xxx/xxx.csv', 'r', encoding='utf8') as f:
    sql = """
    COPY table_a FROM STDIN With CSV on conflict (time) 
    do update set name=??, description=??;
    """
    cur.copy_expert(sql, f)
    conn.commit()
Frank Liao
  • 855
  • 1
  • 8
  • 25
  • Create a temporary table, COPY your csv there, then INSERT from said temporary to your main table with required conflict resolution. – Ilja Everilä Dec 29 '17 at 08:11

4 Answers4

58

In this SO post, there are two answers that -combined together- provide a nice solution for successfully using ON CONFLICT. The example below, uses ON CONFLICT DO NOTHING;:

BEGIN;
CREATE TEMP TABLE tmp_table 
(LIKE main_table INCLUDING DEFAULTS)
ON COMMIT DROP;
    
COPY tmp_table FROM 'full/file/name/here';
    
INSERT INTO main_table
SELECT *
FROM tmp_table
ON CONFLICT DO NOTHING;
COMMIT;

Replace both instances of main_table with the name of your table.

raratiru
  • 8,748
  • 4
  • 73
  • 113
  • 1
    It is kind of implicit, buy maybe it should be said, that this works only within a transaction. So you need a `BEGIN;` on top and a `COMMIT;` at the end, unless something creates this for you. – Jan Dec 01 '22 at 09:00
24

Thanks for every master's solution.

this is my solution.

sql = """
CREATE TABLE temp_h (
    time ,
    name,
    description
);
COPY temp_h FROM STDIN With CSV;

INSERT INTO table_a(time, name, description)
SELECT *
FROM temp_h ON conflict (time) 
DO update set name=EXCLUDED.name, description=EXCLUDED.description;

DROP TABLE temp_h;
"""
James Douglas
  • 3,328
  • 2
  • 22
  • 43
Frank Liao
  • 855
  • 1
  • 8
  • 25
  • 1
    A minor suggestion: use `CREATE TEMPORARY TABLE temp_h`, so it's automatically dropped at the end of the session (or transaction, if you want and use `ON COMMIT DROP`) *and* it's [unlogged](https://www.postgresql.org/docs/current/static/sql-createtable.html), which [improves performance](https://www.postgresql.org/message-id/23406.1364255041%40sss.pgh.pa.us). – Ilja Everilä Dec 29 '17 at 10:38
  • In [this post](https://stackoverflow.com/a/42217872/2996101), it is argued that identical rows should not be updated without need. – raratiru Apr 16 '18 at 13:10
6

I've managed to accomplish a bulk upsert with the following function (suggestions are welcome):

import io
from sqlalchemy.engine import Engine
from sqlalchemy.ext import declarative_base

BaseModel = declarative_base()


def upsert_bulk(engine: Engine, model: BaseModel, data: io.StringIO) -> None:
    """
    Fast way to upsert multiple entries at once

    :param `db`: DB Session
    :param `data`: CSV in a stream object
    """
    table_name = model.__tablename__
    temp_table_name = f"temp_{table_name}"

    columns = [c.key for c in model.__table__.columns]

    # Select only columns to be updated (in my case, all non-id columns)
    variable_columns = [c for c in columns if c != "id"]

    # Create string with set of columns to be updated
    update_set = ", ".join([f"{v}=EXCLUDED.{v}" for v in variable_columns])

    # Rewind data and prepare it for `copy_from`
    data.seek(0)

    with conn.cursor() as cur:
        # Creates temporary empty table with same columns and types as
        # the final table
        cur.execute(
            f"""
            CREATE TEMPORARY TABLE {temp_table_name} (LIKE {table_name})
            ON COMMIT DROP
            """
        )

        # Copy stream data to the created temporary table in DB
        cur.copy_from(data, temp_table_name)

        # Inserts copied data from the temporary table to the final table
        # updating existing values at each new conflict
        cur.execute(
            f"""
            INSERT INTO {table_name}({', '.join(columns)})
            SELECT * FROM {temp_table_name}
            ON CONFLICT (id) DO UPDATE SET {update_set}
            """
        )

        # Drops temporary table (I believe this step is unnecessary,
        # but tables sizes where growing without any new data modifications
        # if this command isn't executed)
        cur.execute(f"DROP TABLE {temp_table_name}")

        # Commit everything through cursor
        conn.commit()

    conn.close()

2

https://www.postgresql.org/docs/current/static/sql-copy.html

there is no copy ... on conflict do statement in postgres

https://www.postgresql.org/docs/current/static/sql-insert.html

only insert ... on conflict do

Vao Tsun
  • 47,234
  • 13
  • 100
  • 132