
Is there a way to combine Pandas groupby and qcut so that they return a single column of nested tiles? Specifically, suppose I have two groups of data and I want qcut applied to each group separately, with the results returned in one column. This would be similar to MS SQL Server's NTILE() function used with PARTITION BY.

     A    B  C
0  foo  0.1  1
1  foo  0.5  2
2  foo  1.0  3
3  bar  0.1  1
4  bar  0.5  2
5  bar  1.0  3

In the dataframe above I would like to apply the qcut function to B while partitioning on A to return C.

mhabiger

1 Answer

import pandas as pd

df = pd.DataFrame({'A': 'foo foo foo bar bar bar'.split(),
                   'B': [0.1, 0.5, 1.0] * 2})

# Bin B into 3 quantile-based tiles separately within each group of A
df['C'] = df.groupby(['A'])['B'].transform(
    lambda x: pd.qcut(x, 3, labels=range(1, 4)))
print(df)

yields

     A    B  C
0  foo  0.1  1
1  foo  0.5  2
2  foo  1.0  3
3  bar  0.1  1
4  bar  0.5  2
5  bar  1.0  3
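
If plain integers are preferred over categorical labels, a variant of the same idea is to pass `labels=False`, which makes `pd.qcut` return 0-based bin codes, and then add 1. This is only a sketch on the same sample data; the final `astype(int)` is there on the assumption that `transform` may hand back floats depending on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    "A": "foo foo foo bar bar bar".split(),
    "B": [0.1, 0.5, 1.0] * 2,
})

# labels=False returns 0-based integer bin codes instead of a Categorical,
# so adding 1 gives tiles numbered 1..3 within each group of A.
codes = df.groupby("A")["B"].transform(lambda x: pd.qcut(x, 3, labels=False))
df["C"] = (codes + 1).astype(int)
print(df)
```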
unutbu
  • What if this throws `KeyError: 2`? I have only two columns, grouping by one and binning on the other, and I get this error. – ggupta Oct 07 '22 at 21:10
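
Without the failing data it is hard to say what triggers that `KeyError`. One variant that may be worth trying avoids categorical labels and also copes with tied values: rank the values within each group first, then apply `pd.qcut` to the ranks. The helper name `ntile` and the sample data below are hypothetical, and the sketch assumes every group has at least as many rows as tiles.

```python
import pandas as pd

# Hypothetical data: the "foo" group contains tied values in B.
df = pd.DataFrame({
    "A": ["foo"] * 6 + ["bar"] * 3,
    "B": [1, 1, 1, 2, 2, 2, 0.1, 0.5, 1.0],
})


def ntile(values, n):
    """Return 1-based tile numbers for one group (hypothetical helper)."""
    # method="first" breaks ties by order of appearance, so ranks are unique
    # and qcut never sees duplicate bin edges.
    ranks = values.rank(method="first")
    return pd.qcut(ranks, n, labels=False) + 1


df["C"] = df.groupby("A")["B"].transform(lambda x: ntile(x, 3)).astype(int)
print(df)
```

Because ties are split by order of appearance, equal values of B can land in different tiles, which resembles how NTILE() assigns rows by position rather than by value.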