2

I have found that despite only supporting sqlite , the pandas.io.sql.read_sql method work well to create a dataframe by query a Microsoft Access database.

driver_string="DRIVER={Microsoft Access Driver (*.mdb)}"
file_string = "DBQ="+"test.mdb"
connection_string = ";".join([driver_string, file_string])
cnxn = pyodbc.connect(connection_string)
cursor = cnxn.cursor()
df = pd.read_sql("SELECT * FROM testTable", con=cnxn, index_col = ["indexCol"])

I am more interested in reading data and will not likely write back to the database. I've also tested this with more complex queries that include joins and it seems to work well.

Can anyone tell me if there is a danger here? This seems a lot simpler than using SQLAlchemy (which also doesn't support Access).

I would appreciate any insight anybody has.

DSM
  • 342,061
  • 65
  • 592
  • 494
user2111827
  • 79
  • 1
  • 6

1 Answers1

3

To be clear, read_sql only officially supports sqlite when using a DBAPI connection object, when using an sqlalchemy engine, read_sql supports all databases for which you can provide an sqlalchemy engine (PostgreSQL, MySQL, SQL Server, Oracle, .. among others, see http://docs.sqlalchemy.org/en/rel_0_9/dialects/index.html).

But it is true that, although only sqlite is officially supported, it also works for other databases. At the moment, the implementation for the sqlite fallback relies only on the availability of the conn.cursor(), cursor.execute() methods, and a cursor.fetchmany() or cursor.fetchall() method.
So as long as your connection object provides these (and if it is DBAPI2 compliant, it will), this will work.

Although this will most probably not change in the future, it is not guaranteed (it is only tested for sqlite3). Note that there is a sqlalchemy-access project (https://bitbucket.org/zzzeek/sqlalchemy-access), but it seems it has not got much attention lately (see also connecting sqlalchemy to MSAccess).

Above is true for read_sql, writing to an access database with to_sql will certainly not work (this really only works for sqlite3, as the implementation relies on more sqlite3-specific things).

Community
  • 1
  • 1
joris
  • 133,120
  • 36
  • 247
  • 202
  • Thanks a lot. Your explanation of the support was very helpful. I am dealing with retrieving and analyzing data that's stored in a legacy database and understanding the risk of using read_sql was important. It looks like this will work well, particularly if I freeze the pandas version supported by my application. – user2111827 Jan 20 '15 at 20:00