I have Python code that inserts DataFrame data, created from a CSV file, into Postgres. The data are not cleaned, so I check for errors by inserting rows into the database one by one. I often get an error that a value is too long for type VARCHAR(15), etc. That's fine, but when an error occurs, none of my data gets inserted at all. I would like the code to keep inserting the remaining rows when an error occurs instead of stopping entirely...
import psycopg2
import pandas as pd
from io import StringIO

def df2db(conn: psycopg2.extensions.connection, df: pd.DataFrame, table: str):
    columns = [col for col in df.columns]
    buf = StringIO()
    df.to_csv(buf, sep='\t', na_rep='\\N', index=False, header=False)
    buf.seek(0)
    cursor = conn.cursor()
    for row in buf:
        row = row.replace("\n", "").split("\t")
        row = ["'" + val + "'" for val in row]
        try:
            cursor.execute(f"INSERT INTO {table} ({','.join(columns)}) VALUES({','.join(row)}) ON CONFLICT DO NOTHING")
        except psycopg2.Error:
            conn.rollback()
            continue  # here it continues, but my data are not inserted
    cursor.close()
BTW, I have about 20 million records, so I can't afford expensive per-row processing.
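One direction I've been looking at (a minimal sketch, not tested on the full 20M rows): switch the connection to autocommit so each INSERT commits as its own transaction, meaning a failed row no longer rolls back the rows that already succeeded, and let psycopg2 handle quoting and NULLs via query parameters instead of building the SQL string by hand. The helper name df2db_keep_going and the bad_rows counter are just for illustration.

import psycopg2
import pandas as pd

def df2db_keep_going(conn: psycopg2.extensions.connection, df: pd.DataFrame, table: str) -> int:
    # Each execute() commits immediately, so one bad row doesn't
    # undo the inserts that already went through.
    conn.autocommit = True
    # Convert NaN to None so psycopg2 sends real NULLs.
    df = df.astype(object).where(pd.notnull(df), None)
    # Note: table/column names are interpolated directly here,
    # which assumes they come from a trusted source.
    columns = ','.join(df.columns)
    placeholders = ','.join(['%s'] * len(df.columns))
    query = f"INSERT INTO {table} ({columns}) VALUES ({placeholders}) ON CONFLICT DO NOTHING"
    cursor = conn.cursor()
    bad_rows = 0
    for row in df.itertuples(index=False, name=None):
        try:
            # psycopg2 does the value quoting itself via %s parameters.
            cursor.execute(query, row)
        except psycopg2.Error:
            bad_rows += 1  # skip the offending row and keep going
    cursor.close()
    return bad_rows

I'm not sure this is the fastest option at this scale, though; is there a better pattern for skipping bad rows without paying a commit per row?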