Error while querying pandas dataframe using pysqldf function

Question

I am trying to run the below code, but eventually not able to run it in Jupyter notebook.

from IPython.display import display
from sklearn import datasets
import data as data
import numpy as nm
import pandas as pd
import matplotlib.pyplot as plt
import pandasql as ps
from pandasql import *
from sqlalchemy import text
from sqlalchemy import sql
from sqlalchemy import create_engine
path = 'D:/train.csv'
titanic = pd.read_csv(open(path), sep=',')
print(titanic.head())
print(titanic.shape)

print (type(titanic))

globals()['titanic']

pysqldf = lambda q: sqldf(q, globals())

query = 'SELECT * FROM titanic LIMIT 3'
pysqldf(query)

The above code I wrote to query a Pandas dataframe using pandasql library features, but it's giving me the below error, not able to figure out what I am missing.

Log:

AttributeError  Traceback (most recent call last)
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\sqlalchemy\engine\base.py:1410, in Connection.execute(self, statement, parameters, execution_options)
   1409 try:
   1410 meth = statement._execute_on_connection
   1411 except AttributeError as err:

AttributeError: 'str' object has no attribute '_execute_on_connection'

The above exception was the direct cause of the following exception:

ObjectNotExecutableError Traceback (most recent call last)
Cell In[41], line 2
      1 query = 'SELECT * FROM titanic LIMIT 3'
      2 pysqldf(query)

Cell In[38], line 1, in <lambda>(q)
1 pysqldf = lambda q: sqldf(q, globals())

Object Not Executable Error: Not an executable object: 'SELECT * FROM titanic LIMIT 3'

Please help.

Did you make everything way more complicated trying to sort this out? Your code looks very convoluted when it could be simpler. I based my trials on [here](https://www.kaggle.com/code/cosmosankur/sql-for-data-analytics-with-titanic-data) instead of your code block. Using that I was seeing the same error you saw. Searching 'pandasql' and that error in a search engine led to [here](https://stackoverflow.com/a/75464429/8508004). And so in my notebook I ran `%pip install --upgrade 'sqlalchemy<2.0'`, restarted the kernel, and re-ran the examples. Now they worked as written there. — Wayne, Mar 29 '23 at 17:39
Hey @Wayne , Thanks for the reply, but it's still the same error because I have already followed the both your here links earlier, that's why posted here. Did the pip install --upgrade 'sqlalchemy<2.0 as well, but no use. If you can please tell me step by step what need to be done exactly, it'll be really great. — Rajdeep, Mar 30 '23 at 06:24
I already told you how I did it step-by-step. There's nothing else I can add other than linking to where you can try it yourself to see. I'll post that later. It would seem your roll back isn't being connected properly. — Wayne, Mar 30 '23 at 12:13

score 1 · Answer 1 · answered Apr 04 '23 at 11:09

I was having the same problem when running:

import pandas as pd
import  pandasql as ps
df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [10, 20, 30, 40]})
query = "SELECT * FROM  df LIMIT 1"
result = ps.sqldf(query, globals())

So, pip install SQLAlchemy==1.4.46 and restart kernel solved it.

Error while querying pandas dataframe using pysqldf function

1 Answers1

Linked