0

if I have a dataframe:

>>> import pandas as pd
>>> df = pd.DataFrame({'grp':['A', 'A', 'B', 'B', 'B'], 'pos' : [1, 2, 1, 2, 3], 'desc1' : ['X1', 'X2', 'Y1', 'Y2', 'Y3'], 'desc2' : ['A1', 'A2', 'A1', 'A2', 'A3']})
>>> df['desc'] = df.desc1 + ' (' + df.desc2 + ')'
>>> df = df.drop(columns=['desc1', 'desc2'])
>>> df
  grp  pos     desc
0   A    1  X1 (A1)
1   A    2  X2 (A2)
2   B    1  Y1 (A1)
3   B    2  Y2 (A2)
4   B    3  Y3 (A3)
>>>

I would like to transform i to the following dataframe:

  grp     pos1     pos2     pos3
0   A  X1 (A1)  X2 (A2)     None
1   B  Y1 (A1)  Y2 (A2)  Y3 (A3)

I would like to group everything by "grp" and for each position to have desc in column cell. Groups have variable number of positions. How to do the transformation?

Regards.

user3225309
  • 1,183
  • 3
  • 15
  • 31
  • check this answer https://stackoverflow.com/questions/19530568/can-pandas-groupby-aggregate-into-a-list-rather-than-sum-mean-etc – shantanu pathak May 15 '20 at 09:35

2 Answers2

2

IIUC pd.crosstab

df1 = pd.crosstab(df.grp,df.pos,df.desc,aggfunc=lambda x : x)\
                .add_prefix('pos')\
                .reset_index()\
                .rename_axis(None,axis=1)

print(df1)
  grp     pos1     pos2     pos3
0   A  X1 (A1)  X2 (A2)      NaN
1   B  Y1 (A1)  Y2 (A2)  Y3 (A3)
Umar.H
  • 22,559
  • 7
  • 39
  • 74
1

Or you could solve it using groupby and unstack:

import pandas as pd
df = pd.DataFrame({'grp':['A', 'A', 'B', 'B', 'B'], 'pos' : [1, 2, 1, 2, 3], 'desc1' : ['X1', 'X2', 'Y1', 'Y2', 'Y3'], 'desc2' : ['A1', 'A2', 'A1', 'A2', 'A3']})
df['desc'] = df.desc1 + ' (' + df.desc2 + ')'
df = df.drop(columns=['desc1', 'desc2'])

df1 = df.groupby(['grp', 'pos'])['desc'].first().unstack('pos')
print(df1)
#Output:
pos        1        2        3
grp                           
A    X1 (A1)  X2 (A2)      NaN
B    Y1 (A1)  Y2 (A2)  Y3 (A3)
Hidden_NB
  • 68
  • 6