
I'm trying to follow the method for inserting a pandas DataFrame into SQL Server that is mentioned here, as it appears to be the fastest way to import a large number of rows.

However, I am struggling to figure out the connection parameters. I am not using a DSN; I have a server name, a database name, and I'm using a trusted connection (i.e. Windows login).

import sqlalchemy
import urllib

server = 'MYServer'
db = 'MyDB'

cxn_str = "DRIVER={SQL Server Native Client 11.0};SERVER=" + server +",1433;DATABASE="+db+";Trusted_Connection='Yes'"
#cxn_str = "Trusted_Connection='Yes',Driver='{ODBC Driver 13 for SQL Server}',Server="+server+",Database="+db

params = urllib.parse.quote_plus(cxn_str)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
conn = engine.connect().connection
cursor = conn.cursor()

I'm just not sure what the correct way to specify my connection string is. Any suggestions?

roganjosh
Reddspark

2 Answers


I have been working with pandas and SQL Server for a while, and the fastest way I have found to insert a lot of data into a table is the following.

You can create a temporary CSV using:

df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')

Then use pyodbc and the Transact-SQL BULK INSERT statement:

import pyodbc

conn = pyodbc.connect(DRIVER='{SQL Server}', Server='server_name', Database='Database_name', trusted_connection='yes')
cur = conn.cursor()

cur.execute("""BULK INSERT table_name
               FROM 'C:\\Users\\folders path\\new_file_name.csv'
               WITH
               (
                   CODEPAGE = 'ACP',
                   FIRSTROW = 2,
                   FIELDTERMINATOR = ',',
                   ROWTERMINATOR = '\n'
               )""")
conn.commit()

cur.close()
conn.close()

Then you can delete the file:

import os 
os.remove('new_file_name.csv')

It took only a second or so to load a large amount of data into SQL Server this way. I hope this gives you an idea.

Note: don't forget to account for the index column — to_csv writes the DataFrame index as an extra first column by default, so the target table needs a matching field. That was my mistake when I started using this lol.
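To illustrate the index point with a small, hypothetical DataFrame (the column names here are made up for the sketch): by default to_csv emits the index as an unnamed first column, while index=False keeps the CSV columns aligned one-to-one with the table columns.

```python
import pandas as pd

# Hypothetical sample frame standing in for the real data
df = pd.DataFrame({'city': ['Leeds', 'York'], 'pop': [793, 208]})

# Default: the index becomes an extra, unnamed first column,
# so the target table would need a matching field
with_index = df.to_csv()

# index=False drops the index, so CSV columns match the table
without_index = df.to_csv(index=False)
```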

virtualdvid
    You may be interested in the new `fast_executemany` option described [here](https://stackoverflow.com/a/47057189/2144390). – Gord Thompson Jan 13 '18 at 23:07
  • Yeah, I would like to! I will check your link when I have my laptop again. I was reading the article from this post too, but I didn't like the part about breaking the rows up into batches of 1000, mmm. I'd also like to try another solution I found that shows how to create a real temporary CSV in Python instead of a flat file. Anyway, I will keep this answer in case someone needs it for future reference. – virtualdvid Jan 13 '18 at 23:16
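For reference, a minimal sketch of the `fast_executemany` approach Gord links to, reusing the server and database names from the question (this assumes SQLAlchemy 1.3+, where `fast_executemany` is accepted directly by `create_engine` for the mssql+pyodbc dialect):

```python
import urllib
import sqlalchemy

# Server/database names taken from the question above
server = 'MYServer'
db = 'MyDB'

cxn_str = (
    "DRIVER={SQL Server Native Client 11.0};"
    "SERVER=" + server + ",1433;"
    "DATABASE=" + db + ";"
    "Trusted_Connection=Yes"
)
params = urllib.parse.quote_plus(cxn_str)

# fast_executemany=True makes pyodbc send rows in batches
# instead of one INSERT round-trip per row
engine = sqlalchemy.create_engine(
    "mssql+pyodbc:///?odbc_connect=%s" % params,
    fast_executemany=True,
)

# With the engine in place, the bulk load itself would be:
# df.to_sql('table_name', engine, if_exists='append', index=False)
```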

Connection string parameter values should not be enclosed in quotes, so you should use Trusted_Connection=Yes instead of Trusted_Connection='Yes'.

Gord Thompson