0

I have the following data frame

Col1 Col2 Col3 Col4
A01 CY5 1   2.53204725571834
A01 CY5 2   7.40293083315009
A01 CY5 3   0.089914886654128
A01 FAM 1   -132.859675789172
A01 FAM 2   -43.9435529112998
A01 FAM 3   -19.423769786752
H12 TEX 1   -125.964887745247
H12 TEX 2   -89.0624569094034
H12 TEX 3   -66.0548663619256

I have more than thousand such values. Also, I can get rid of third counting column in the output dataframe. All the values are tab separated. I want to transform this data into the following format:

Col1 Col2    Col3
A01   CY5    2.53204725571834 7.40293083315009 0.089914886654128
A01   FAM   -132.859675789172 -43.9435529112998 -19.423769786752
H12   TEX   -125.964887745247 -89.0624569094034 66.0548663619256
Sanwal
  • 307
  • 2
  • 11

1 Answers1

0

Use:

#filter values by `Col3` if necessary
df = df[df['Col3'].lt(4)].copy()

#join values by space
df2 = (df.groupby(['Col1','Col2'])['Col4']
         .agg(lambda x: ' '.join(x.astype(str)))
         .reset_index())

EDIT:

#filter values by `Col3` if necessary
df = df[df['Col3'].lt(4)].copy()

#pivoting
df3 = df.pivot(index=['Col1','Col2'],columns='Col3', values='Col4')

For oldier pandas versions:

df3 = df.set_index(['Col1','Col2', 'Col3'])['col4'].unstack().reset_index()
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252