I am trying to insert pandas dataframe into a postgres table as fast as possible. I have tried out executemany but it's too slow. It was taking 6 hours for inserting 24000 rows. Most online solutions suggested that psycopg2's copy_from
is a faster method. I am using the following code to insert dataframe into postgres table.
f = io.StringIO()
df.to_csv(f, index=False, header=False)
f.seek(0)
con = psycopg2.connect(**config_dev)
cur = con.cursor()
cur.copy_from(f, 'table', null="", sep=",")
con.commit()
cur.close()
But I am getting this error i.e. psycopg2.DataError: invalid input syntax for integer: "10% off on shoulder bags" CONTEXT: COPY table, line 1, column coupon_id: "10% off on shoulder bags"
. Here the text from the next column is trying to get inserted into coupon_id which is an integer. This is because my first column in dataframe is being inserted to primary_key id in the postgres table which messes up the columns and hence the data error. Does anyone has a solution for it?