
I have Python code to insert DataFrame data created from a CSV file into Postgres. My data are not cleaned beforehand, so I'm checking for errors by inserting rows into the database one by one.

Often I get an error that my value is too long for type VARCHAR(15), etc. That's okay, but when I get the error, my data are not inserted at all. I would like the code to continue inserting data when an error occurs instead of stopping entirely...


import psycopg2
import pandas as pd
from io import StringIO


def df2db(conn: psycopg2.extensions.connection, df: pd.DataFrame, table: str):
    columns = list(df.columns)
    buf = StringIO()
    df.to_csv(buf, sep='\t', na_rep='\\N', index=False, header=False)
    buf.seek(0)
    cursor = conn.cursor()
    for row in buf:
        row = row.replace("\n", "").split("\t")
        row = ["'" + val + "'" for val in row]
        try:
            cursor.execute(f"INSERT INTO {table} ({','.join(columns)}) VALUES({','.join(row)}) ON CONFLICT DO NOTHING")
        except psycopg2.Error:
            conn.rollback()
            continue  # here it continues, but my data are not inserted
    cursor.close()

BTW, I have about 20 million records, so I can't afford expensive processing.

Marcel Kopera

1 Answer


No “expensive” processing is needed. Just cut the strings to the field lengths defined by the database schema, either on the Python side or with functions in the query.
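For the Python side, here is a minimal sketch; varchar_limits and truncate_to_schema are hypothetical helpers, and the information_schema query assumes the table sits in your default search path:

import psycopg2
import pandas as pd
from typing import Dict


def varchar_limits(conn: psycopg2.extensions.connection, table: str) -> Dict[str, int]:
    # Read the VARCHAR(n) limits straight from the schema, so you don't have
    # to hard-code a length for every column by hand.
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT column_name, character_maximum_length
            FROM information_schema.columns
            WHERE table_name = %s AND character_maximum_length IS NOT NULL
            """,
            (table,),
        )
        return dict(cur.fetchall())


def truncate_to_schema(df: pd.DataFrame, limits: Dict[str, int]) -> pd.DataFrame:
    # Cut each string column down to its VARCHAR length. NaN values pass through
    # .str.slice untouched; this assumes the columns hold strings (they came from a CSV).
    for col, max_len in limits.items():
        if col in df.columns:
            df[col] = df[col].str.slice(0, max_len)
    return df

Calling truncate_to_schema(df, varchar_limits(conn, table)) before your insert loop means no row is rejected for being too long.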

But I'd do it differently: use psql, pgAdmin, or the COPY SQL statement to load the CSV as-is into a temporary table (that will be very fast), and then run a query that copies the data over, cutting strings to their maximum lengths.
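A sketch of that route using psycopg2's copy_expert, assuming a hypothetical target table target_table with columns col_a VARCHAR(15) and col_b VARCHAR(50); adjust the names, lengths, and CSV options to your real schema and files:

import psycopg2


def load_via_temp_table(conn: psycopg2.extensions.connection, csv_path: str):
    with conn.cursor() as cur, open(csv_path, encoding="utf-8") as f:
        # 1. Stage the raw CSV in a temp table with plain text columns,
        #    so nothing is rejected for being too long.
        cur.execute("CREATE TEMP TABLE staging (col_a text, col_b text) ON COMMIT DROP")
        cur.copy_expert("COPY staging FROM STDIN WITH (FORMAT csv, HEADER true)", f)
        # 2. Copy into the real table, truncating each string to its VARCHAR limit.
        cur.execute(
            """
            INSERT INTO target_table (col_a, col_b)
            SELECT left(col_a, 15), left(col_b, 50)
            FROM staging
            ON CONFLICT DO NOTHING
            """
        )
    conn.commit()

Since COPY ... FROM STDIN streams the file over the existing database connection, copy_expert also works from a client that isn't on the database host.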

See this q&a for more details.

Kuba hasn't forgotten Monica
    thanks for the answer. I'm inserting 4 CSV files with 30 columns each, so I think it would be very verbose to write out the max length for 120 columns in Python. And I'm not inserting from localhost, so would the second approach need `paramiko` or something like that? – Marcel Kopera Dec 25 '20 at 00:34