Pandas : Concat rows of a dataframe with same index to form custom string in pairs

Question

Say I have a dataframe

import pandas as pd

df = pd.DataFrame({'colA': ['ABC', 'JKL', 'STU', '123'],
                   'colB': ['DEF', 'MNO', 'VWX', '456'],
                   'colC': ['GHI', 'PQR', 'YZ', '789']}, index=[0, 0, 1, 1])

  colA colB colC
0  ABC  DEF  GHI
0  JKL  MNO  PQR
1  STU  VWX   YZ
1  123  456  789

It's guaranteed that the two rows of every pair share the same index, so we would like the end result to be:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating through the rows, but that takes a lot of time. I was thinking of something like .groupby(level=0), but I can't figure out what aggregation to apply after it.

`df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))` — Scott Boston, Jul 28 '22 at 17:40
@ScottBoston Is it possible to apply multiple functions in a single aggregate call? For example, first list and then tuple. Nobody would actually want that combination, but I'm curious whether multiple functions can be applied at all. — Himanshu Poddar, Jul 28 '22 at 17:40
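
On the question raised in the comment above: `agg` does accept more than one function, either as a list (every function applied to every column) or as a dict (one function per column). A minimal sketch, assuming the question's sample frame; the particular functions chosen here (`'first'`, `'last'`, `'_'.join`, `list`) are only illustrative and are not part of the original thread:

```python
import pandas as pd

df = pd.DataFrame({'colA': ['ABC', 'JKL', 'STU', '123'],
                   'colB': ['DEF', 'MNO', 'VWX', '456'],
                   'colC': ['GHI', 'PQR', 'YZ', '789']}, index=[0, 0, 1, 1])

grouped = df.groupby(level=0)

# A list of functions applies every function to every column;
# the result has two-level columns such as ('colA', 'first').
both = grouped.agg(['first', 'last'])

# A dict picks one function per column (this pairing is arbitrary).
per_column = grouped.agg({'colA': 'first', 'colB': '_'.join, 'colC': list})

print(both)
print(per_column)
```

With a list, the output columns become a MultiIndex, so a single result is addressed as `both.loc[0, ('colA', 'last')]`.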

score 3 · Accepted Answer · answered Jul 28 '22 at 17:42

IIUC, you can use .agg with a lambda function that joins each group and appends the group's index label, or you can join first and append the suffix to the dataframe after the groupby:

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

Or

df_out=df.groupby(level=0).agg('_'.join)
df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
print(df_out)
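
The two-step variant can be easier to follow with its intermediate values spelled out. A minimal sketch, assuming the sample frame from the question; the names `joined`, `suffix`, and `one_step` are introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'colA': ['ABC', 'JKL', 'STU', '123'],
                   'colB': ['DEF', 'MNO', 'VWX', '456'],
                   'colC': ['GHI', 'PQR', 'YZ', '789']}, index=[0, 0, 1, 1])

# Step 1: join the strings of each group, column by column.
joined = df.groupby(level=0).agg('_'.join)

# Step 2: build one suffix per group from the index labels ('_0', '_1').
suffix = '_' + joined.index.to_series().astype(str)

# axis=0 matches the suffix to rows by index label, so every column
# in a row receives that row's suffix.
df_out = joined.add(suffix, axis=0)

# The single-lambda variant should produce the same values.
one_step = df.groupby(level=0).agg(lambda x: '_'.join(x) + '_' + str(x.index[0]))

print(df_out)
print(df_out.equals(one_step))
```

Inside the lambda, `x.index[0]` is safe because every row of a group carries the same label; after the aggregation the index has one label per group, which is why the suffix is built from the whole index instead.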

score 2 · Answer 2 · answered Jul 28 '22 at 17:42

df.groupby(level=0).agg(lambda x: f"{'_'.join(x)}_{x.index[0]}")

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

answered by BeRT2me

score 2 · Answer 3 · answered Jul 28 '22 at 17:42

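You can aggregate first and then append each row's own index label (using x.index[0] at this stage would append _0 to every row):

df.groupby(level=0).agg('_'.join).transform(lambda x: x + '_' + x.index.to_series().astype(str))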

answered by SomeDude
