0

I have a script that read 500,000 row data from excel file and then insert into MSSQL database;

how can decrease insert time?

my code is like below :

cnxn = pyodbc.connect('DRIVER={SQL Server};Server=DESKTOP\MS2019;\
Database=203-Temp;Port=1433;User ID=sa;Password=1234;TrustServerCertificate=True')
cnxn.autocommit = False
cursor = cnxn.cursor()
data_df = pd.read_excel('a.xlsx').fillna(value=b'0')
data_dict = data_df.to_dict("records")
for row in tqdm(data_dict, desc="Inserting Data Into DataBase"):
     c1= row['col1']
     c2= row['col2']
     c3= row['col3']
     c4= row['col4']
     c5= row['col5']
     c6= row['col6']
     c7= row['col7']
     c8= row['col8']
     c9= row['col9']
     c10= row['col10']

     cursor.execute(
                "INSERT INTO Temp (UID, Name, ShName, GID, GLink,"
                "GName,GAbout,PUID,PUnID,Date)"
                " values (?,?,?,?,?,?,?,?,?,?)", c1, c2, c3, c4, c5, c6, c7, c8, c9, c10)
cursor.commit()

I used method DataFrame.to_dict() to regular datas.

I think use "BULK INSERT " or using method curser.executemany() can help me.

what do you offer me?

how to use this method?

  • See: https://stackoverflow.com/a/452934/2681662 and see transactions: https://stackoverflow.com/a/10153706/2681662 – MSH Apr 16 '23 at 08:21
  • Why don't you just try using `executemany` if you already know about the method? You can just find out for yourself without needing someone else to answer – roganjosh Apr 16 '23 at 08:25
  • You can also checkout OPENROWSET/OPENDATASOURCE, it can read excel files directly (albeit, usually the excel reading has some issues). https://www.mssqltips.com/sqlservertip/6178/read-excel-file-in-sql-server-with-openrowset-or-opendatasource/ – siggemannen Apr 16 '23 at 08:55
  • Aside... `PORT` is not a connection string keyword for Microsoft SQL Server-related drivers. To connect to an instance by name (which requires that the SQL Browser service is accessible, running and up-to-date) use `Server=serverNameOrIpAddressHere\instanceNameHere;`, to use a specific port number use `Server=serverNameOrIpAddressHere,portNumber;` instead. Ref: [Proof of concept connecting to SQL using pyodbc](https://learn.microsoft.com/en-us/sql/connect/python/pyodbc/step-3-proof-of-concept-connecting-to-sql-using-pyodbc) – AlwaysLearning Apr 16 '23 at 08:56
  • replay to @AlwaysLearning : i think you misunderstand my question, my connection is working properly and insert into DataBase without any problems. – JANI-COBIN Apr 16 '23 at 09:36
  • `BULK INSERT` using a CSV might be the fastest option, if a bit unwieldy – Charlieface Apr 16 '23 at 09:44
  • It's called an aside because it's addressing something not related to your question, but something else that's wrong with your code and/or assumptions. – AlwaysLearning Apr 16 '23 at 10:25
  • See also: https://stackoverflow.com/q/29638136/2144390 – Gord Thompson Apr 16 '23 at 11:58

0 Answers0