1

I have a table with two columns. In the second column is binary, 0 or 1 value. I would like to keep a running count of these values until it switches. For example, i would like to add a 'count' column that looks like this:

Date          sig   count
2000-01-03    0     1
2000-01-04    0     2
2000-01-05    1     1
2000-01-06    1     2
2000-01-07    1     3
2000-01-08    1     4
2000-01-09    0     1
2000-01-010   0     2
2000-01-011   0     3
2000-01-012   0     4
2000-01-013   0     5

Is there a simple way of doing this with pandas, numpy or simply python without having to iterate or use loops?

Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58
mike01010
  • 5,226
  • 6
  • 44
  • 77

2 Answers2

2

In numpy you can find an indexes where different groups starts and counts of these groups then apply np.add.accumulate on a sequence of repeated ones with some of them replaced:

def accumulative_count(sig):
    marker_idx = np.flatnonzero(np.diff(sig)) + 1
    counts = np.diff(marker_idx, prepend=0)
    counter = np.ones(len(sig), dtype=int)
    counter[marker_idx] -= counts
    return np.add.accumulate(counter)

df['count'] = accumulative_count[df['sig']]

Sample run:

sig = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
marker_idx = np.flatnonzero(np.diff(sig)) + 1
counts = np.diff(marker_idx, prepend=0,)
counter = np.ones(len(sig), dtype=int)
counter[marker_idx] -= counts

>>> marker_idx #starts of groups
array([2, 6], dtype=int64)
>>> counts #counts of groups
array([2, 4], dtype=int64)
>>> counter #a sequence of units with some of the units replaced
array([ 1,  1, -1,  1,  1,  1, -3,  1,  1,  1,  1])
>>> np.add.accumulate(counter) #output
array([1, 2, 1, 2, 3, 4, 1, 2, 3, 4, 5], dtype=int32)
mathfux
  • 5,759
  • 1
  • 14
  • 34
1
df['count'] = df.groupby((df['sig'] != df['sig'].shift(1)).cumsum()).cumcount()+1

In [1571]: df
Out[1571]: 
           Date  sig  count
0    2000-01-03    0      1
1    2000-01-04    0      2
2    2000-01-05    1      1
3    2000-01-06    1      2
4    2000-01-07    1      3
5    2000-01-08    1      4
6    2000-01-09    0      1
7   2000-01-010    0      2
8   2000-01-011    0      3
9   2000-01-012    0      4
10  2000-01-013    0      5
Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58