2

Say I have a dataframe

df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'],
                   'colB' : ['DEF', 'MNO', 'VWX', '456'],
                   'colC' : ['GHI', 'PQR', 'YZ', '789'],}, index = [0,0,1,1])
   colA colB colC
0  ABC   DEF  GHI 
0  JKL   MNO  PQR
1  STU   VWX   YZ
1  123   456  789

Its guranteed that every pair will have the same index, so we would like the end result to be :

     colA        colB       colC
0  ABC_JKL_0   DEF_MNO_0  GHI_PQR_0 
1  STU_123_1   VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating through rows but that's taking a lot of time. I was thinking of something like .groupby(level=0) but can't figure out the next aggregation apply part

Himanshu Poddar
  • 7,112
  • 10
  • 47
  • 93

3 Answers3

3

IIUC, you can try something like this using .agg and a lambda function or you can add it into the dataframe after the groupby:

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_pQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

Or

df_out=df.groupby(level=0).agg('_'.join)
df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
print(df_out)
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
2
df.groupby(level=0).agg(lambda x: f"{'_'.join(x)}_{x.index[0]}")

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
2

You can do:

df.groupby(level=0).agg('_'.join).transform(lambda x:x+'_'+str(x.index[0]))
SomeDude
  • 13,876
  • 5
  • 21
  • 44