0

The below code tries to connect to a mssql database using pymssql. I have a CSV file and I am trying to push all the rows into a single data table in the mssql database. I am getting a 'KeyError' when I try to execute the code after opening the CSV file.

import csv
import pymssql
conn = pymssql.connect(host="host name", 
                       database="dbname", 
                       user = "username",
                       password = "password")

cursor = conn.cursor()
if(conn):
    print("True")
else:
    print("False")

with open ('path to csv file', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader) 
    query = "INSERT INTO Marketing({'URL', 'Domain_name', 'Downloadables', 'Text_without_javascript', 'Downloadable_Link'}) VALUES ({%s,%s,%s,%s,%s})"
    query = query.format(','.join('[' + x + ']' for x in columns), ','.join('?' * len(columns)))
    cursor = conn.cursor()
    for data in reader:
        cursor.execute(query, tuple(data))
    cursor.commit()

The below is the error that I get:

KeyError: "'URL', 'Domain_name', 'Downloadables', 'Text_without_javascript', 'Downloadable_Link'"

Using to_sql

file_path = "path to csv"
engine = create_engine("mssql://user:password@host/database") 

df = pd.read_csv(file_path, encoding  = 'latin')
df.to_sql(name='Marketing',con=engine,if_exists='append')

Output:

InterfaceError: (pyodbc.InterfaceError) ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
developer
  • 257
  • 1
  • 3
  • 15
  • 1
    is the error in the python error stack? can you update with more of the error that Python threw? Thanks. – arcee123 Oct 21 '19 at 11:55
  • 1
    my initial thought is you are trying to throw a tuple. Try a list. – arcee123 Oct 21 '19 at 11:57
  • 1
    plus we need to work on your `data in reader` for loop. where are you assigning the values to `query`. I'm not seeing it. – arcee123 Oct 21 '19 at 11:59
  • @arcee123 Thanks for the reply. But this is the only error that was thrown. – developer Oct 21 '19 at 12:00
  • 1
    The query seems a bit unecessarily complex? Try move the query line into the for loop, with query = f"INSERT INTO TABLE ('URL', 'Domain_name', 'Downloadables', 'Text_without_javascript', 'Downloadable_Link') VALUES ({data[0]});" Also if the csv isn't big using pd.to_sql is much easier. – Merelda Oct 21 '19 at 12:08
  • @arcee123 I tried to run a for loop on every row by maintaining the number of rows and executing the insert query, but that didn't help either – developer Oct 21 '19 at 12:08
  • @Merelda, Sure, I will remember that. My csv is not so huge, and I tried the to_sql but it throws an error- InterfaceError: (pyodbc.InterfaceError) ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)'). I have updated my code to show how I have done that – developer Oct 21 '19 at 12:34
  • 1
    For pyodbc, you will need the driver param as described by the wiki https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-SQL-Server-from-Windows. Did the query string above help? – Merelda Oct 21 '19 at 12:48
  • 1
    using this as your reference: https://stackoverflow.com/questions/21257899/writing-a-csv-file-into-sql-server-database-using-python, remove `tuple(data)`, and leave `data`. See if this helps. – arcee123 Oct 21 '19 at 12:54
  • @arcee123 I actually tried it, but that did not help. I actually was able to fix it in a different way. I will update it. – developer Oct 22 '19 at 05:01
  • @Merelda Pyodbc didn't work for me, which is the reason I have been avoiding its usage. – developer Oct 22 '19 at 09:08

1 Answers1

0

I tried everything, from converting the parameters which were being passed to a tuple, passing it as is, but didn't help. Below is the code that helped me fix the issue:

with open ('path to csv file', 'r') as f:
    for row in f:
        reader = csv.reader(f)
#        print(reader)
        columns = next(reader) 
#        print(columns)
        cursor = conn.cursor()
        for data in reader:
#            print(data)
            data = tuple(data)
#            print(data)
            query = ("INSERT INTO Marketing(URL, Domain_name, Downloadables, Text_without_javascript, Downloadable_Link) VALUES (%s,%s,%s,%s,%s)")
            parameters = data
#            query = query.format(','.join('?' * len(columns)))
            cursor.execute(query, parameters)
        conn.commit()

Note: The connecting to the database part remains as in the question.

developer
  • 257
  • 1
  • 3
  • 15