1

I've installed sql-alchemy Access so that I'm able to use pandas and pyodbc to query my Access DB's.

The issue is, it's incredibly slow because it does single row inserts. Another post suggested I use method='multi' and while it seems to work for whoever asked that question, it throws a CompileError for me.

CompileError: The 'access' dialect with current database version settings does not support in-place multirow inserts.

AttributeError: 'CompileError' object has no attribute 'orig'

import pandas as pd
import pyodbc
import urllib
from sqlalchemy import create_engine
connection_string = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    rf"DBQ={accessDB};"
    r"ExtendedAnsiSQL=1;"
)

connection_uri = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
engine = create_engine(connection_uri)
conn = engine.connect()

# Read in tableau SuperStore data
dfSS = pd.read_excel(ssData)

dfSS.to_sql('SuperStore', conn, index=False, method='multi')
Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
Mofongo
  • 131
  • 8
  • https://github.com/gordthompson/sqlalchemy-access/wiki/%5Bpandas%5D-.to_sql%28%E2%80%A6-%2C-method%3D%22multi%22%29-does-not-work – Gord Thompson Feb 04 '22 at 01:04

1 Answers1

3

Access SQL doesn't support multi-row inserts, so a to_sql will never be able to support them as well. That other post is probably using SQLite.

Instead, you can write the data frame to CSV, and insert the CSV by using a query.

Or, of course, not read the Excel in Python at all, but just insert the Excel file by query. This will always be much faster as Access can directly read the data instead of Python reading it and then transmitting it.

E.g.

INSERT INTO SuperStore 
SELECT * FROM [Sheet1$] IN "C:\Path\To\File.xlsx"'Excel 12.0 Macro;HDR=Yes'

You should be able to execute this using pyodbc without needing to involve sqlalchemy. Do note the double and single quote combination, they can be a bit painful when embedding them in other programming languages.

Erik A
  • 31,639
  • 12
  • 42
  • 67
  • Wow thank you so much for this! I'm going to give it a shot now and implement it into my script. – Mofongo Feb 03 '22 at 20:25
  • 1
    Related: https://github.com/gordthompson/sqlalchemy-access/wiki/%5Bpandas%5D-faster-alternative-to-.to_sql()-for-large-uploads – Gord Thompson Feb 03 '22 at 21:07
  • Hi Erik. Just a note that this particular INSERT INTO technique appears to work with OLEDB but not with ODBC (and hence not with pyodbc). I just verified that using VBScript and ADODB. – Gord Thompson Feb 04 '22 at 14:15
  • I have updated the wiki entry to use ADODB instead of COM Automation since the former will work without a full install of Access. Thanks for the tip! – Gord Thompson Feb 04 '22 at 16:01
  • @GordThompson I'm not that well versed in Python, but got it to work in R over an ODBC connection just fine. Sometimes adding `IMEX=1;` after `HDR=Yes` helps if the driver has trouble deducing types. I've noted trouble if the database is open, though. – Erik A Feb 05 '22 at 08:52