pandasql allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R.
Questions tagged [pandasql]
235 questions
53
votes
5 answers
Python Pandas to_sql, how to create a table with a primary key?
I would like to create a MySQL table with Pandas' to_sql function which has a primary key (it is usually kind of good to have a primary key in a mysql table) as so:
group_export.to_sql(con = db, name = config.table_group_export, if_exists =…

patapouf_ai
- 17,605
- 13
- 92
- 132
16
votes
2 answers
Group by without an aggregate function
I've seen a pandasql query like this:
df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
sqldf('select * from df group by A', locals())
This gives:
A B
0 1 3
1 2 6
I find it really weird to have a group by without an aggregate function,…

zoran119
- 10,657
- 12
- 46
- 88
13
votes
1 answer
AttributeError: 'str' object has no attribute '_execute_on_connection'
I have a problem with following code:
from pandasql import sqldf
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
query = "SELECT * FROM df WHERE column1 > 1"
new_dataframe =…

ciavam
- 133
- 1
- 6
11
votes
1 answer
How to create a dataframe with date range as values in a column?
I have three variables
csiti - 23454 : (integer)
units - [ 11,22,33,44,55,66,77] : (integer list which is of specific length 'n' always )
begin_date - '2019-10-16' : (string)
How do I create a dataframe from this data like
csiti units…

Underoos
- 4,708
- 8
- 42
- 85
9
votes
1 answer
Pandas Merge two rows into a single row based on columns
I have 2 rows that look like these,
------------------------------
DealName | Target | Acquirer |
-----------------------------
ABC-XYZ | ABC | None |
------------------------------
ABC-XYZ | None | XYZ …

CodeSsscala
- 729
- 3
- 11
- 23
9
votes
1 answer
querying panda df to filter rows where a column is not Nan
I am new to python and using pandas.
I want to query a dataframe and filter the rows where one of the columns is not NaN.
I have tried:
a=dictionarydf.label.isnull()
but a is populated with true or false.
Tried this…

DileepGogula
- 331
- 5
- 20
8
votes
2 answers
Convert unique numbers to md5 hash using pandas
Good morning, All.
I want to convert my social security numbers to a md5 hash hex number. The outcome should be a unique md5 hash hex number for each social security number.
My data format is as follows:
ob =…

Dave
- 6,968
- 7
- 26
- 32
7
votes
2 answers
PandaSQL very slow
I'm currently switching from R to Python (anconda/Spyder Python 3) for data analysis purposes. In R I used to use a lot R sqldf. Since I'm good at sql queries, I didn't want to re-learn data.table syntax. Using R sqldf, I never had performance…

Philippe Stepniewski
- 103
- 1
- 6
7
votes
2 answers
Merging DataFrames on multiple conditions - not specifically on equal values
Firstly, sorry if this is a bit lengthy, but I wanted to fully describe what I have having problems with and what I have tried already.
I am trying to join (merge) together two dataframe objects on multiple conditions. I know how to do this if the…

Tim Vivian-Griffiths
- 131
- 1
- 1
- 6
5
votes
2 answers
Run sql query on pandas dataframe
I have a dataframe df
ID
Price
Region
1
23
AUS
1
45
DXB
2
25
GER
2
18
TUN
I want to write a code in python to get the following output
ID
Price
Region
1
45
DXB
2
25
TUN
I have tried using pandasql to get the output…

techie123
- 55
- 1
- 4
5
votes
1 answer
Update table statement not working in pandasql
I have a dataset with some null values I wanted to update to read missing instead of null
I've looked through all the forums and documentation and can't seem to find a response
from pandasql import sqldf
pysqldf = lambda q: sqldf(q,…

willepi
- 127
- 1
- 11
5
votes
1 answer
How to group by and count number of none missing values for each column in group in pandas
I have the following datadrame
user_id var qualified_date loyal_date
1 1 2017-01-17 2017-02-03
2 1 2017-01-03 2017-01-13
3 1 2017-01-11 NaT
4 1 NaT NaT
5 1 …

chessosapiens
- 3,159
- 10
- 36
- 58
5
votes
1 answer
about pandasql locals() and globals() method issue
For sqldf method of pandasql package, there is a "session/environment variables", could be locals() or globals(), could anyone let me know what it is for? And any document reference when should we use locals(), and when should we use…

Lin Ma
- 9,739
- 32
- 105
- 175
4
votes
1 answer
ObjectNotExecutableError when running Pandasql query in Google Colab
I have a pandas dataframe called df.
Doing a standard SQL query using pandasql. However, it gives me the following error:
ObjectNotExecutableError: Not an executable object: 'SELECT * FROM df'
from pandasql import sqldf
q1 = """SELECT * FROM…

Katsu
- 8,479
- 3
- 15
- 16
4
votes
1 answer
How to keep date format when using pandasql?
In order to join two dataframes df_A and df_B where a date of df_A should lie between a date range of dataframe df_B I am using pandasql. However i noticed that pandasql does not return a proper date (datetime64) as in the original dataframe but a…

Beerbrewer
- 41
- 2