I am working with pandas DataFrames to send data to a MySQL server.
I have a few .csv files, e.g. one.csv and two.csv. I read each CSV file into a DataFrame and use the SQLAlchemy library to send it to the MySQL server:
```python
from sqlalchemy import create_engine

def db_write(db_config, contact_df):
    IP_ADDR = db_config["ip_addr"]
    PORT_NUMBER = db_config["port_num"]
    USER_NAME = db_config["user_name"]
    PASSWORD = db_config["password"]
    DATABASE_NAME = db_config["database_name"]
    TABLE_NAME = db_config["table_name"]

    engine = create_engine(
        f"mysql+pymysql://{USER_NAME}:{PASSWORD}@{IP_ADDR}:{PORT_NUMBER}/{DATABASE_NAME}"
    )
    con = engine.connect()
    contact_df.to_sql(con=con, name=TABLE_NAME, if_exists='append', index=False)
```
one.csv uploads to the server successfully, but while sending two.csv I get an IntegrityError:

```
sqlalchemy.exc.IntegrityError: (pymysql.err.IntegrityError) (1062, "Duplicate entry 'xxxxxxxxxxxx' for key 'db.mobile_number'")
```
NOTE: in my schema, id is the primary key and mobile_number has a unique constraint. When sending the second file, there is a chance that some of its rows are already present on the server.
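For context, one workaround I have been sketching is to pull the existing unique keys from the server and drop the overlapping rows before calling .to_sql(). A minimal, runnable sketch of that idea (it uses an in-memory SQLite database as a stand-in for my MySQL server, and the contacts table name mirrors my schema):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# stand-in for the mysql+pymysql engine from db_write()
engine = create_engine("sqlite:///:memory:")

# schema mirroring mine: id primary key, mobile_number unique
with engine.begin() as con:
    con.execute(text(
        "CREATE TABLE contacts ("
        "id INTEGER PRIMARY KEY, "
        "mobile_number TEXT UNIQUE)"
    ))

# first file inserts cleanly
one_df = pd.DataFrame({"mobile_number": ["111", "222"]})
one_df.to_sql("contacts", con=engine, if_exists="append", index=False)

# second file shares "222" with what is already on the server
two_df = pd.DataFrame({"mobile_number": ["222", "333"]})

# fetch the keys already present and keep only the genuinely new rows
existing = pd.read_sql("SELECT mobile_number FROM contacts", con=engine)
new_rows = two_df[~two_df["mobile_number"].isin(existing["mobile_number"])]
new_rows.to_sql("contacts", con=engine, if_exists="append", index=False)
```

This avoids the IntegrityError entirely, but it means an extra round trip to read the keys, which is why I am asking whether INSERT IGNORE can be used instead.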
Question
- How can I skip duplicate rows when inserting into the database with the .to_sql() method?
- How can I use MySQL's INSERT IGNORE and REPLACE from SQLAlchemy?
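The direction I was exploring for the second point is the method= parameter of .to_sql(), which accepts a callable that receives the table and the row data and can build its own INSERT statement. A hedged sketch of what I mean (the insert_ignore name is mine, and the demo uses an in-memory SQLite database so it can run anywhere; SQLite's prefix is "OR IGNORE", whereas on MySQL it would be "IGNORE"):

```python
import pandas as pd
from sqlalchemy import create_engine, insert, text

# stand-in for the mysql+pymysql engine from db_write()
engine = create_engine("sqlite:///:memory:")

with engine.begin() as con:
    con.execute(text(
        "CREATE TABLE contacts ("
        "id INTEGER PRIMARY KEY, "
        "mobile_number TEXT UNIQUE)"
    ))

def insert_ignore(table, conn, keys, data_iter):
    """Custom .to_sql() writer: INSERT with an IGNORE prefix.

    table is pandas' SQLTable wrapper; table.table is the underlying
    SQLAlchemy Table. On MySQL, use .prefix_with("IGNORE") instead of
    SQLite's "OR IGNORE".
    """
    stmt = insert(table.table).prefix_with("OR IGNORE")  # "IGNORE" on MySQL
    conn.execute(stmt, [dict(zip(keys, row)) for row in data_iter])

# first batch inserts cleanly
pd.DataFrame({"mobile_number": ["111", "222"]}).to_sql(
    "contacts", con=engine, if_exists="append", index=False,
    method=insert_ignore)

# second batch shares "222"; the duplicate is silently skipped
pd.DataFrame({"mobile_number": ["222", "333"]}).to_sql(
    "contacts", con=engine, if_exists="append", index=False,
    method=insert_ignore)
```

Is this the right way to wire INSERT IGNORE into .to_sql(), and is there an equivalent pattern for REPLACE?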
What I have tried
I referred to the Stack Overflow questions below, but neither worked for me:
- Pandas to_sql fails on duplicate primary key
- Appending Pandas dataframe to sqlite table by primary key
Please help, and thanks in advance.