8

I want to "insert ignore" an entire pandas dataframe into mysql. Is there a way to do this without looping over the rows?

In dataframe.to_sql I only see the option if_exists 'append' but will this still continue on duplicate unique keys?

Yogesh Yadav
  • 4,557
  • 6
  • 34
  • 40
user3605780
  • 6,542
  • 13
  • 42
  • 67

2 Answers2

17

Consider using a temp table (with exact structure of final table) that is always replaced by pandas then run the INSERT IGNORE in a cursor call:

dataframe.to_sql('myTempTable', con, if_exists ='replace')

cur = con.cursor()
cur.execute("INSERT IGNORE INTO myFinalTable SELECT * FROM myTempTable")
con.commit()
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • 3
    Replace the last 3 lines like this if you are using sqlalchemy: `connection = engine.connect() result = connection.execute("INSERT IGNORE INTO myFinalTable SELECT * FROM myTempTable") connection.close()` – kristian Dec 17 '17 at 18:20
0

There is no way to do this in pandas till the current version of pandas (0.20.3) .

The option if_exists applies only on table ( not on rows ) as stated in the documentation.

if_exists : {‘fail’, ‘replace’, ‘append’}, default ‘fail’

fail: If table exists, do nothing.

replace: If table exists, drop it, recreate it, and insert data.

append: If table exists, insert data. Create if does not exist.

Via Looping

This will slow down the process as you are inserting one row at a time

for x in xrange(data_frame.shape[0]):
    try:
        data_frame.iloc[x:x+1].to_sql(con=sql_engine, name="table_name", if_exists='append')
    except IntegrityError:
        # Your code to handle duplicates
        pass 
Yogesh Yadav
  • 4,557
  • 6
  • 34
  • 40