0

I need to fetch data from a MySQL database into Pandas dataframe using odo library in Python. Odo's documentation only provides information on passing a table name to fetch the data but how do I pass a SQL query string that fetches the required data from the database.

The following code works:

import odo

import pandas as pd

data = odo('mysql+pymysql://username:{0}@localhost/dbname::{1}'.format('password', 'table_name'), pd.DataFrame)

But how do I pass a SQL string instead of a table name. Because I need to join multiple other tables to pull the required data.

user2714753
  • 115
  • 3
  • 11

1 Answers1

0

Passing a string directly to odo is not supported by the module. There are three methods to move the data using the tools listed.

First, create a sql query as a string and read using:

data = pandas.read_sql_query(sql, con, index_col=None,
                             coerce_float=True, params=None,
                             parse_dates=None, chunksize=None)[source]

ref http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.read_sql_query.html#pandas.read_sql_query

Second, utilizing the odo method requires running a query into a dictionary then use the dictionary in the odo(source, destination) structure.

cursor.execute(sql)
results = db.engine.execute(sql)

data = odo(results, pd.DataFrame)

ref pg 30 of https://media.readthedocs.org/pdf/odo/latest/odo.pdf

ref How to execute raw SQL in SQLAlchemy-flask app

ref cursor.fetchall() vs list(cursor) in Python

Last, to increase the speed of execution, consider appending the pandas data frame for each result in results.

result = db.engine.execute(sql).fetchone()

data = pd.DataFrame(index=index, columns=list('AB'))
data = df_.fillna(0) # with 0s rather than NaNs

while result is not None:
    dataappend = pd.DataFrame(result, columns=list('AB'))
    data.append(dataappend)
    result = db.engine.execute(sql).fetchone()

ref https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.append.html

ref Creating an empty Pandas DataFrame, then filling it?

brddawg
  • 434
  • 8
  • 19
  • Thanks but reading database data using pandas is slow compared to ODO. Is there any way I can pass a SQL string to the ODO method? – user2714753 Oct 13 '17 at 01:32
  • ODO method does not query as the documentation points out. In order to utilize the (source, destination) structure you'll need to pass a dictionary as described in my edited answer. – brddawg Oct 13 '17 at 13:20