Questions tagged [pandasql]

pandasql allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R.

235 questions
53
votes
5 answers

Python Pandas to_sql, how to create a table with a primary key?

I would like to create a MySQL table with Pandas' to_sql function which has a primary key (it is usually kind of good to have a primary key in a mysql table) as so: group_export.to_sql(con = db, name = config.table_group_export, if_exists =…
patapouf_ai
  • 17,605
  • 13
  • 92
  • 132
16
votes
2 answers

Group by without an aggregate function

I've seen a pandasql query like this: df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]}) sqldf('select * from df group by A', locals()) This gives: A B 0 1 3 1 2 6 I find it really weird to have a group by without an aggregate function,…
zoran119
  • 10,657
  • 12
  • 46
  • 88
13
votes
1 answer

AttributeError: 'str' object has no attribute '_execute_on_connection'

I have a problem with following code: from pandasql import sqldf import pandas as pd df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]}) query = "SELECT * FROM df WHERE column1 > 1" new_dataframe =…
ciavam
  • 133
  • 1
  • 6
11
votes
1 answer

How to create a dataframe with date range as values in a column?

I have three variables csiti - 23454 : (integer) units - [ 11,22,33,44,55,66,77] : (integer list which is of specific length 'n' always ) begin_date - '2019-10-16' : (string) How do I create a dataframe from this data like csiti units…
Underoos
  • 4,708
  • 8
  • 42
  • 85
9
votes
1 answer

Pandas Merge two rows into a single row based on columns

I have 2 rows that look like these, ------------------------------ DealName | Target | Acquirer | ----------------------------- ABC-XYZ | ABC | None | ------------------------------ ABC-XYZ | None | XYZ …
CodeSsscala
  • 729
  • 3
  • 11
  • 23
9
votes
1 answer

querying panda df to filter rows where a column is not Nan

I am new to python and using pandas. I want to query a dataframe and filter the rows where one of the columns is not NaN. I have tried: a=dictionarydf.label.isnull() but a is populated with true or false. Tried this…
DileepGogula
  • 331
  • 5
  • 20
8
votes
2 answers

Convert unique numbers to md5 hash using pandas

Good morning, All. I want to convert my social security numbers to a md5 hash hex number. The outcome should be a unique md5 hash hex number for each social security number. My data format is as follows: ob =…
Dave
  • 6,968
  • 7
  • 26
  • 32
7
votes
2 answers

PandaSQL very slow

I'm currently switching from R to Python (anconda/Spyder Python 3) for data analysis purposes. In R I used to use a lot R sqldf. Since I'm good at sql queries, I didn't want to re-learn data.table syntax. Using R sqldf, I never had performance…
7
votes
2 answers

Merging DataFrames on multiple conditions - not specifically on equal values

Firstly, sorry if this is a bit lengthy, but I wanted to fully describe what I have having problems with and what I have tried already. I am trying to join (merge) together two dataframe objects on multiple conditions. I know how to do this if the…
5
votes
2 answers

Run sql query on pandas dataframe

I have a dataframe df ID Price Region 1 23 AUS 1 45 DXB 2 25 GER 2 18 TUN I want to write a code in python to get the following output ID Price Region 1 45 DXB 2 25 TUN I have tried using pandasql to get the output…
techie123
  • 55
  • 1
  • 4
5
votes
1 answer

Update table statement not working in pandasql

I have a dataset with some null values I wanted to update to read missing instead of null I've looked through all the forums and documentation and can't seem to find a response from pandasql import sqldf pysqldf = lambda q: sqldf(q,…
willepi
  • 127
  • 1
  • 11
5
votes
1 answer

How to group by and count number of none missing values for each column in group in pandas

I have the following datadrame user_id var qualified_date loyal_date 1 1 2017-01-17 2017-02-03 2 1 2017-01-03 2017-01-13 3 1 2017-01-11 NaT 4 1 NaT NaT 5 1 …
chessosapiens
  • 3,159
  • 10
  • 36
  • 58
5
votes
1 answer

about pandasql locals() and globals() method issue

For sqldf method of pandasql package, there is a "session/environment variables", could be locals() or globals(), could anyone let me know what it is for? And any document reference when should we use locals(), and when should we use…
Lin Ma
  • 9,739
  • 32
  • 105
  • 175
4
votes
1 answer

ObjectNotExecutableError when running Pandasql query in Google Colab

I have a pandas dataframe called df. Doing a standard SQL query using pandasql. However, it gives me the following error: ObjectNotExecutableError: Not an executable object: 'SELECT * FROM df' from pandasql import sqldf q1 = """SELECT * FROM…
Katsu
  • 8,479
  • 3
  • 15
  • 16
4
votes
1 answer

How to keep date format when using pandasql?

In order to join two dataframes df_A and df_B where a date of df_A should lie between a date range of dataframe df_B I am using pandasql. However i noticed that pandasql does not return a proper date (datetime64) as in the original dataframe but a…
Beerbrewer
  • 41
  • 2
1
2 3
15 16