I have a dataset that has binary flag values for each sinid, like this:

>>> import pandas as pd
>>> df = pd.DataFrame({'sinid':['abc','def','ghi','abc','ghi'],'flag1':[1,1,0,0,1],'flag2':[1,0,1,0,0]})
>>> df
  sinid  flag1  flag2
0   abc      1      1
1   def      1      0
2   ghi      0      1
3   abc      0      0
4   ghi      1      0

I want to add up the values for each sinid. I think I need groupby, but I'm not sure how to use it.

This is the expected result:

  sinid  flag1  flag2
0   abc      1      1
1   def      1      0
2   ghi      1      1
Soufiane Sabiri

3 Answers

Group by sinid, take the sum, and reset the index.

df = df.groupby(['sinid']).sum().reset_index()
df

Result:

  sinid  flag1  flag2
0   abc      1      1
1   def      1      0
2   ghi      1      1
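
A side note on the choice of aggregation: with the sample data, no sinid has the same flag set in more than one row, so the sums stay at 0 or 1. If that can happen in the real data, sum() returns a count rather than a binary flag, and max() may be closer to what is wanted. The sketch below illustrates the difference. The extra row and the names extra, df_dup, summed and any_set are hypothetical and are not part of the question.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'sinid': ['abc', 'def', 'ghi', 'abc', 'ghi'],
    'flag1': [1, 1, 0, 0, 1],
    'flag2': [1, 0, 1, 0, 0],
})

# Hypothetical extra row (not in the question): 'abc' has flag1 set a second time.
extra = pd.DataFrame({'sinid': ['abc'], 'flag1': [1], 'flag2': [0]})
df_dup = pd.concat([df, extra], ignore_index=True)

# sum() counts how many rows per sinid had the flag set.
summed = df_dup.groupby('sinid').sum()

# max() yields 1 if any row per sinid had the flag set, so values stay binary.
any_set = df_dup.groupby('sinid').max()

print(summed)
print(any_set)
```

Which of the two is right depends on whether "add values" means counting occurrences or combining flags.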
jose_bacoy

Just sum the grouped DataFrame:

df.groupby('sinid').sum()

       flag1  flag2
sinid              
abc        1      1
def        1      0
ghi        1      1
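
Note that this leaves sinid as the index rather than as a regular column. If the layout from the question is wanted, one option is to pass as_index=False to groupby, which should be equivalent to calling reset_index() afterwards. A minimal sketch using the question's data (the name result is only for illustration):

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'sinid': ['abc', 'def', 'ghi', 'abc', 'ghi'],
    'flag1': [1, 1, 0, 0, 1],
    'flag2': [1, 0, 1, 0, 0],
})

# as_index=False keeps 'sinid' as an ordinary column instead of the index.
result = df.groupby('sinid', as_index=False).sum()
print(result)
```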
vurmux

This works:

df.groupby(['sinid'])[['flag1', 'flag2']].sum().reset_index()

  sinid  flag1  flag2
0   abc      1      1
1   def      1      0
2   ghi      1      1
Adarsh Chavakula