
I have a dataframe with 4 million rows. I use to_sql to insert it into an existing table, but it takes almost 2 hours to finish the insert. Is there a way to speed it up?

I use this method:

import pandas as pd
from sqlalchemy import create_engine, types
from pandas.io.sql import SQLTable

# Patch pandas so to_sql issues one multi-row INSERT per chunk
# instead of one INSERT per row
def _execute_insert(self, conn, keys, data_iter):
    data = [dict((k, v) for k, v in zip(keys, row)) for row in data_iter]
    conn.execute(self.insert_statement().values(data))

SQLTable._execute_insert = _execute_insert

df.to_sql(raw_table_name[i], con=db, index=False, if_exists='append', chunksize=50000)

Method website
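
For comparison, newer pandas versions (0.24 and later) expose this multi-row insert behaviour directly through the method argument of to_sql, so no monkey-patching is needed. A minimal sketch, assuming a local SQLite engine and a hypothetical table name my_table:

import pandas as pd
from sqlalchemy import create_engine

# hypothetical engine and dataframe, for illustration only
engine = create_engine('sqlite:///example.db')
df = pd.DataFrame({'a': range(10), 'b': range(10)})

# method='multi' batches many rows into a single INSERT statement;
# combine it with chunksize so each statement stays within driver limits
df.to_sql('my_table', con=engine, index=False, if_exists='append',
          chunksize=1000, method='multi')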


  • df.to_sql(..., dtype=dtyp) — specifying the data types of your data frame columns will speed up the whole process. https://stackoverflow.com/questions/42727990/speed-up-to-sql-when-writing-pandas-dataframe-to-oracle-database-using-sqlalch – BENY Aug 22 '18 at 22:15
  • I also tried this method, but it still takes more than one hour to insert. – Love_Code Aug 22 '18 at 23:07
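
A hedged sketch of what the dtype suggestion in the first comment might look like; the engine, table name, column names, and types here are made up for illustration:

import pandas as pd
from sqlalchemy import create_engine, types

engine = create_engine('sqlite:///example.db')  # hypothetical engine
df = pd.DataFrame({'id': [1, 2], 'name': ['a', 'b'], 'price': [1.5, 2.5]})

# Map each column to an explicit SQLAlchemy type so pandas does not have
# to infer types (and fall back to slow, oversized defaults such as TEXT)
dtyp = {
    'id': types.Integer(),
    'name': types.VARCHAR(length=50),
    'price': types.Float(),
}

df.to_sql('my_table', con=engine, index=False, if_exists='append',
          chunksize=50000, dtype=dtyp)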

0 Answers