
I have a data frame like this:

df
col1    col2
 12       A
 14       A
 22       B
 24       C
 20       A
 18       B
 16       B

Now I want to sum col1 over each run of consecutive rows that share the same col2 value (a value that is not repeated on the next row keeps its own col1). The final data frame should look like:

col2    col1
 A       26
 B       22
 C       24
 A       20
 B       34

I can use groupby(), but how do I distinguish consecutive runs from the same value appearing again later in the column?

Kallol

1 Answer


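Use GroupBy.agg with a helper Series built from Series.ne, Series.shift and Series.cumsum. Comparing each col2 value with the previous one flags the start of every run, and the cumulative sum of those flags gives each consecutive run its own group label: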

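# Label each consecutive run of equal col2 values: ne/shift flags a change, cumsum numbers the runs
s = df['col2'].ne(df['col2'].shift()).cumsum()
# Aggregate within each run, then drop the helper labels from the index
df = df.groupby(s).agg({'col2': 'first', 'col1': 'sum'}).reset_index(drop=True)
print(df)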
  col2  col1
0    A    26
1    B    22
2    C    24
3    A    20
4    B    34
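
To see why the helper Series separates the two runs of A (and of B), it can help to look at the intermediate values. The following is a minimal, self-contained sketch that rebuilds the question's data; the names starts, run_id and result are only illustrative and do not appear in the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [12, 14, 22, 24, 20, 18, 16],
    "col2": ["A", "A", "B", "C", "A", "B", "B"],
})

# True wherever col2 differs from the previous row. The first row is
# always True because shift() leaves a missing value there.
starts = df["col2"].ne(df["col2"].shift())

# Running count of run starts: rows in the same consecutive run share
# one label. The rename is illustrative, to keep the label distinct from col2.
run_id = starts.cumsum().rename("run_id")
print(run_id.tolist())  # [1, 1, 2, 3, 4, 5, 5]

# Group by the run label instead of by col2 itself.
result = (
    df.groupby(run_id)
      .agg({"col2": "first", "col1": "sum"})
      .reset_index(drop=True)
)
print(result)
```

Grouping by col2 directly would merge every A into one group; grouping by the run label keeps the first A run (26) apart from the later single A (20).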
jezrael