I'm using the latest versions of Python, pandas, pyodbc, SQL Server Management Studio, and SQLAlchemy.
I have the following code:
from sqlalchemy import create_engine
from sqlalchemy.engine import URL

connection_string = f'DRIVER={driver};SERVER={server};DATABASE={database};ENCRYPT=yes;UID={username};PWD={password}'
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
engine = create_engine(connection_url)
import pandas as pd

df = pd.read_csv('some-file.csv',
                 sep='|',
                 skiprows=[1],
                 on_bad_lines='warn',
                 encoding='latin1')
The CSV file has 179,098 rows and 28 columns. Then I have the following:
df.to_sql('some-file-column',
          con=engine,
          if_exists='append',
          index=False,
          method='multi',
          chunksize=1000)
When executing this, I get:
(pyodbc.Error) ('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')
If I delete "chunksize" from the parameters (i.e., letting all records be loaded at once), I get the following error:
(pyodbc.ProgrammingError) ('The SQL contains -31528 parameter markers, but 5014744 parameters were supplied', 'HY000')
I get the same error, but with different "parameter markers" and "parameters supplied" counts, depending on whether I add "chunksize" back and on the number I use for the chunks.
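The numbers do seem consistent with an integer overflow in the parameter-marker count, though this is just my own arithmetic, not something I've confirmed in the driver: 179,098 rows × 28 columns is exactly the 5,014,744 "parameters supplied", and that count reinterpreted as a signed 16-bit integer gives the -31,528 from the error message:

rows, cols = 179_098, 28
params = rows * cols                         # 5_014_744 -- the "parameters supplied"
wrapped = (params + 2**15) % 2**16 - 2**15   # reinterpret as a signed 16-bit integer
print(params, wrapped)                       # 5014744 -31528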
More importantly, when I don't use "method = 'multi'", the database loads perfectly fine, down to the last record. Granted, it does so very slowly, which makes it unusable. It also loads fine when I limit the dataframe rows read from the CSV with, for example, "nrows = 100", but beyond that it throws an error.
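My working theory, and it is only a guess, is that SQL Server caps a single statement at 2,100 parameters, so with "method = 'multi'" and "chunksize = 1000" each INSERT would carry 1000 × 28 = 28,000 markers, far over that cap, and a chunk would have to stay under 2100 / 28 ≈ 75 rows (though that doesn't quite square with "nrows = 100" working, so I may be wrong). A sketch of what I plan to try, assuming that limit really is the constraint:

# assuming SQL Server's 2,100-parameters-per-statement limit is the constraint
safe_chunksize = 2100 // len(df.columns) - 1   # 2100 // 28 - 1 = 74 rows per chunk
df.to_sql('some-file-column',
          con=engine,
          if_exists='append',
          index=False,
          method='multi',
          chunksize=safe_chunksize)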
As an alternative, I have tried using pyodbc directly:
import pyodbc
cursor = pyodbc.connect(connection_string).cursor()  # plain pyodbc connection
cursor.fast_executemany = True
cursor.executemany("insertion_string_goes_here", df.values.tolist())
which works perfectly fine.
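I'm also aware that, if I've understood the SQLAlchemy docs correctly, the mssql+pyodbc dialect can enable this same pyodbc fast path on the engine itself, in which case to_sql without "method = 'multi'" should use it:

# fast_executemany is passed through to pyodbc by the mssql+pyodbc dialect
engine = create_engine(connection_url, fast_executemany=True)
df.to_sql('some-file-column', con=engine, if_exists='append', index=False, chunksize=1000)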
Help? I haven't found any answer as to why this happens. I need some of SQLAlchemy's functionality (executing .sql scripts from Python, which pyodbc doesn't allow AFAIK), which is why I don't simply stay with pyodbc.
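For context, this is the kind of thing I need SQLAlchemy for (the file name is just an example):

from sqlalchemy import text

with engine.begin() as conn:                      # transaction-scoped connection
    with open('some_script.sql', encoding='utf-8') as f:
        conn.execute(text(f.read()))              # run the script's statements

(I realize scripts containing GO would need to be split first, since GO is an SSMS batch separator rather than T-SQL.)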