
I am currently doing an Association Rules project and this is my first time working with SQL Server. I have a Pandas DataFrame with all the results and want to transfer them to a SQL Server table.

However, the DataFrame has shape (1788020, 4), and when I run the code it takes too long and stops at around 500 rows.

Just in case, this is the code I am using:

cursor2 = conn2.cursor()

cursor2.execute("truncate table APriori_test")

for index, row in dataset.iterrows():
    cursor2.execute(
        "INSERT INTO APriori_test(antecedents, consequents, support, confidence) VALUES (?, ?, ?, ?)",
        row.antecedents, row.consequents, row.support, row.confidence)

conn2.commit()

However, when I insert only 1000 rows at a time, for example, it runs smoothly with no problems.

How can I automatically insert the data in batches of, for example, 10000 rows at a time?

I am open to other suggestions.

Thank you!

Cookye
  • Do you mean "batches" instead of "branches"? – Charlieface Sep 02 '21 at 09:33
  • Does this answer your question? [How to speed up bulk insert to MS SQL Server using pyodbc](https://stackoverflow.com/questions/29638136/how-to-speed-up-bulk-insert-to-ms-sql-server-using-pyodbc) – Charlieface Sep 02 '21 at 09:34
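
For reference, a minimal sketch of the batched-insert approach the linked answer describes, using pyodbc's executemany with fast_executemany enabled. The dataset DataFrame, conn2 connection, and APriori_test table are taken from the question; the batch size is an assumption you would tune:

cursor2 = conn2.cursor()
cursor2.fast_executemany = True  # send many parameter rows per round trip instead of one at a time

sql = "INSERT INTO APriori_test(antecedents, consequents, support, confidence) VALUES (?, ?, ?, ?)"

# turn the DataFrame into a list of parameter rows in column order
rows = dataset[['antecedents', 'consequents', 'support', 'confidence']].values.tolist()

batch_size = 10000
for start in range(0, len(rows), batch_size):
    cursor2.executemany(sql, rows[start:start + batch_size])
    conn2.commit()  # commit each batch so a failure does not roll back everything inserted so far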

1 Answer


If you are using pandas, you might find sqlalchemy together with pandas.DataFrame.to_sql useful. I've never used it with SQL Server, but your code should look something like this:

import pandas as pd
# you have to import your driver, e.g. import pyodbc
from sqlalchemy import create_engine

# replace with your connection string
engine = create_engine("dialect+driver://username:password@host:port/database")

# example DataFrame; replace with your own data
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# appends the rows to MyTable without writing the DataFrame index as a column
df.to_sql('MyTable', con=engine, if_exists='append', index=False)
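
If to_sql is still slow with the defaults, one option (a sketch, not tested against your data) is to enable pyodbc's fast_executemany when creating the engine and write in chunks; the connection string below is a placeholder you would adapt to your server and driver:

engine = create_engine(
    "mssql+pyodbc://username:password@host:port/database?driver=ODBC+Driver+17+for+SQL+Server",
    fast_executemany=True,  # let pyodbc send many rows per round trip
)

df.to_sql('MyTable', con=engine, if_exists='append', index=False, chunksize=10000)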

Alessandro
  • Thank you for your help! Unfortunately this didn't solve the problem. This takes 40 seconds to insert 1000 rows (and 4 columns), even if I use the chunk parameter of the function. – Cookye Sep 02 '21 at 10:37