
I am trying to count runs of consecutive identical values in a dataframe column and store that count in a new column, but I want the count to build up row by row within each run.

Here is what I have so far:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 3, size=(15, 4)), columns=list('ABCD'))
# Start a new subgroup id each time the value in A changes
df['subgroupA'] = (df.A != df.A.shift(1)).cumsum()
# Count the rows in each run of identical values
dfg = df.groupby('subgroupA').size().reset_index(name='numConsec')
# Attach the full run length to every row of that run
df = df.merge(dfg, how='left', on='subgroupA')
df

Here is the result:

    A  B  C  D  subgroupA  numConsec
0   2  1  1  1          1          1
1   1  2  1  0          2          2
2   1  0  2  1          2          2
3   0  1  2  0          3          1
4   1  0  0  1          4          1
5   0  2  2  1          5          2
6   0  2  1  1          5          2
7   1  0  0  1          6          1
8   0  2  0  0          7          4
9   0  0  0  2          7          4
10  0  2  1  1          7          4
11  0  2  2  0          7          4
12  1  2  0  1          8          1
13  0  1  1  0          9          1
14  1  1  1  0         10          1

The problem is that I don't want the numConsec column to hold the full run length on every row. I want it to be a running count, the way it would look if I stepped through the dataframe one row at a time. My dataframe is too large to loop over row by row, as that would be too slow, so I need a vectorized way to make it look like this:

    A  B  C  D  subgroupA  numConsec
0   2  1  1  1          1          1
1   1  2  1  0          2          1
2   1  0  2  1          2          2
3   0  1  2  0          3          1
4   1  0  0  1          4          1
5   0  2  2  1          5          1
6   0  2  1  1          5          2
7   1  0  0  1          6          1
8   0  2  0  0          7          1
9   0  0  0  2          7          2
10  0  2  1  1          7          3
11  0  2  2  0          7          4
12  1  2  0  1          8          1
13  0  1  1  0          9          1
14  1  1  1  0         10          1

Any ideas?
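
One possible approach, offered as a sketch rather than a confirmed answer: pandas' `groupby(...).cumcount()` numbers the rows inside each group starting from 0, so adding 1 to it should give the running count without a Python loop or a merge. The example below assumes only column A matters for the count and hard-codes the A values from the tables above so the result can be compared against the desired output.

```python
import pandas as pd

# Column A copied from the example tables above (B, C, D are not needed for the count)
df = pd.DataFrame({'A': [2, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1]})

# Label each run of identical consecutive values in A, as in the original code
df['subgroupA'] = (df['A'] != df['A'].shift(1)).cumsum()

# cumcount() numbers rows within each group starting at 0, so add 1
# to get a running count that starts at 1 in every run
df['numConsec'] = df.groupby('subgroupA').cumcount() + 1

print(df)
```

If the helper column is not wanted afterwards, it can be dropped with `df.drop(columns='subgroupA')`.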
