I have the following code and data frame:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'A': [1, 2, 3, 4, 5],
'B': [6, 7, 8, 9, 10]})
I want to calculate the 0.25 quantile (25th percentile) of column 'A' and the 0.75 quantile (75th percentile) of column 'B' using np.quantile. I tried the following code:
df.agg({'A': lambda x: np.quantile(a=x, q=0.25),
        'B': lambda x: np.quantile(a=x, q=0.75)})
I obtain the following result:
     A     B
0  1.0   6.0
1  2.0   7.0
2  3.0   8.0
3  4.0   9.0
4  5.0  10.0
However, I was expecting the following result, or something similar:
A 2.0
B 9.0
dtype: float64
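For reference, these expected values are what I get when I call np.quantile directly on each column, outside of agg. This is only a sketch to show where the numbers come from; the variable name expected is mine and is not part of any API:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [6, 7, 8, 9, 10]})

# Call np.quantile on each whole column, with a different q per column,
# and collect the two scalars into a Series indexed by column name.
expected = pd.Series({
    'A': np.quantile(a=df['A'], q=0.25),
    'B': np.quantile(a=df['B'], q=0.75),
})
print(expected)
```

This gives 2.0 for 'A' and 9.0 for 'B', but it bypasses agg, which is the function I would like to use.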
The problem seems to be that each lambda function is applied to every element of the column individually, so np.quantile receives a single value and returns it unchanged, instead of being applied to the column as a whole.
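A small check with plain numpy is consistent with this explanation. It assumes that agg really does hand the lambda one value at a time, which may depend on the pandas version; the names values, per_element and whole are only for illustration:

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Quantile of a single scalar: the scalar itself comes back as a float,
# which matches the element-by-element output shown above.
per_element = [float(np.quantile(a=v, q=0.25)) for v in values]

# Quantile of the whole sequence: a single number, which is what I want.
whole = float(np.quantile(a=values, q=0.25))

print(per_element)
print(whole)
```

Here per_element reproduces column 'A' of the unwanted output, while whole is the 2.0 I was expecting.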
My question is: how can I obtain the expected result using the agg function from pandas and the quantile function from numpy, passing different parameters for each column via lambda functions?
I have already read the posts "Python Pandas: Passing Multiple Functions to agg() with Arguments" and "Specifying arguments to pandas aggregate function", but the solutions there only work when the data is grouped, not on an ungrouped data frame.