1

I have MySQL sever that we can call MySever there are a hundred databases on that server, we can call the databases: database1, database2, database3… database100. Each database has a table called tablepull. What I am trying to do is get each table within each database called ‘tablepull’ and store the table into a python list, the table “tablepull” has the exact same headers for each database and can save as a dataframe in pandas.

Here are the variables being defined:

from sqlalchemy import create_engine
import pandas as pd

UserName = MyUserName
Password = MyPassword
Server = MyServer
Databases = [‘database1’, ‘database2’, ‘database3’,… ‘database100’]
Table = tablepull

I can easily pull one database table in using this syntax:

engine = create_engine(mssql:// UserName: Password@Server/Table?driver=SQL Server Native Client 11.0')
connection = engine.connect()
cnxn = connection.connection
#Read in data into from sql table#
raw_data = pd.read_sql(sql = """SELECT * FROM Table""", con = cnxn)

The goal is to be able to iterate through all of the databases and grab the desired table (tablepull) for each database and save into a list. Any help or guidance would be greatly appreciated.

Thank You

Jake
  • 454
  • 5
  • 26
  • Which [DB-API](https://docs.sqlalchemy.org/en/13/dialects/mssql.html#dialect-mssql) are you using? `pyodbc`, `pymssql`, other? Usually ,it is denoted by the `mssql` keyword in engine connection which you do not have in above redacted code. – Parfait Apr 06 '20 at 22:20

1 Answers1

2

You can connect to multiple databases, provided they are on same server by not specifying database name in connect command.

Now, you can have a list with all database names:

db_list = ['db1', 'db2', ... 'db100']
df_dict = {}

for c, db in enumerate(db_list):
    sql = """SELECT * FROM {}.Table"""
    df_dict[c] = pd.read_sql_query(sql.format(db), engine)        

Finally you will get a dictionary with all dataframes.

This should help.

Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58
  • Thanks, I was able to connect without specifying the database name. I am getting a syntax error on you last line: df{}.format(c) = pd.read_sql_query(query, engine, params=(db)) – Jake Apr 06 '20 at 21:11
  • Edited my answer. Please check. – Mayank Porwal Apr 06 '20 at 21:17
  • Good answer but you cannot parameterize identifiers like table and column names. Use f-strings: `pd.read_sql_query(f"SELECT * FROM {db}.Table", engine)`. – Parfait Apr 06 '20 at 21:22
  • Thank you both but now my connection is not working I must have been mistaken when I thought the connect was made. What I tried is is: engine = create_engine(mssql:// UserName: Password@Server?driver=SQL Server Native Client 11.0') – Jake Apr 06 '20 at 21:30
  • @Parfait Thanks for pointing that out. Updated my answer. Should get a +1 from your end :P – Mayank Porwal Apr 06 '20 at 21:32
  • @Jake Connection not getting formed is not in the scope of this question. Anyway, follow this [link](https://stackoverflow.com/questions/15750711/connecting-to-sql-server-2012-using-sqlalchemy-and-pyodbc) for the same. – Mayank Porwal Apr 06 '20 at 21:35
  • @Parfait Updated. – Mayank Porwal Apr 06 '20 at 22:08
  • @MayankPorwal don't forget to upvote the questions too :) – Jake May 14 '20 at 16:20