
I have a dataframe `data` that contains 13 million rows and 8 columns and is 790 MB in size. My query below was still running after 45 minutes, which seemed like a red flag. I have tried iterating through each row and inserting it into my SQL Server table, but I am unsure how to "chunk" the load so it runs more efficiently. Any help is appreciated!

    import pyodbc

    cnxn = pyodbc.connect("personal connection info here")
    cursor = cnxn.cursor()

    # Insert the DataFrame into SQL Server one row at a time
    for index, row in data.iterrows():
        cursor.execute(
            "INSERT INTO daily_log (Date,Tick,[Open],High,Low,[Close],Adj_close,Volume) "
            "VALUES (?,?,?,?,?,?,?,?)",
            row.Date, row.Tick, row.Open, row.High, row.Low, row.Close,
            row.Adj_close, row.Volume)
    cnxn.commit()
    cursor.close()
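
For reference, a minimal sketch of how the same insert could be batched with `cursor.executemany` instead of one `execute` call per row is shown below. The connection string is a placeholder, the chunk size of 10,000 is arbitrary, and it assumes the DataFrame columns carry the same names as the INSERT column list; depending on your dtypes you may also need to cast NumPy integers to plain Python values first. It can be combined with the `fast_executemany` setting mentioned in the comments below.

    import pyodbc

    cnxn = pyodbc.connect("personal connection info here")  # placeholder connection string
    cursor = cnxn.cursor()

    sql = ("INSERT INTO daily_log (Date,Tick,[Open],High,Low,[Close],Adj_close,Volume) "
           "VALUES (?,?,?,?,?,?,?,?)")

    # Select the columns in the same order as the INSERT column list and
    # convert the rows to plain tuples, which is what executemany expects.
    cols = ["Date", "Tick", "Open", "High", "Low", "Close", "Adj_close", "Volume"]
    records = list(data[cols].itertuples(index=False, name=None))

    chunk_size = 10_000  # arbitrary batch size; tune for your environment
    for start in range(0, len(records), chunk_size):
        cursor.executemany(sql, records[start:start + chunk_size])
        cnxn.commit()  # commit each chunk so progress is not lost on a failure

    cursor.close()
    cnxn.close()
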
bytebybyte
    Does this answer your question? [How to speed up bulk insert to MS SQL Server using pyodbc](https://stackoverflow.com/questions/29638136/how-to-speed-up-bulk-insert-to-ms-sql-server-using-pyodbc) You want the second answer there https://stackoverflow.com/a/47057189/14868997 – Charlieface Jun 26 '21 at 23:47
  • @Charlieface - I will try this out. Would you happen to know if `crsr.fast_executemany = True` needs to be set inside the for-loop or outside, just before the commit? – bytebybyte Jun 27 '21 at 00:28
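
For what it's worth, `fast_executemany` is an attribute of the pyodbc cursor, so a short sketch (reusing the `sql`, `records`, and `chunk_size` names from the snippet above purely for illustration) would set it once right after the cursor is created, outside any loop, before calling `executemany`:

    cursor = cnxn.cursor()
    cursor.fast_executemany = True  # set once on the cursor, not per row or per chunk

    for start in range(0, len(records), chunk_size):
        cursor.executemany(sql, records[start:start + chunk_size])
    cnxn.commit()
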

0 Answers