1

I'm trying to use np.where to count consecutive non-NaN values longer than a certain length as shown below:

e.g. If there are more than 3 consecutive non-NaN values then return True.

Would appreciate any help!

value consecutive
nan False
nan False
1 False
1 False
nan False
4 True
2 True
3 True
nan False
nan False
1 True
3 True
3 True
5 True
kswizzle101_
  • 99
  • 1
  • 4

1 Answers1

3

The idea is to create groups by testing missing values and mapping using Series.map with Series.value_counts to have only rows with non NaNs filtered by inverted mask ~m:

#convert values to numeric
df['value'] = df['value'].astype(float)

m = df['value'].isna()
s = m.cumsum()

N = 3
df['new'] = s.map(s[~m].value_counts()).ge(N) & ~m
print (df)
    value  consecutive    new
0     NaN        False  False
1     NaN        False  False
2     1.0        False  False
3     1.0        False  False
4     NaN        False  False
5     4.0         True   True
6     2.0         True   True
7     3.0         True   True
8     NaN        False  False
9     NaN        False  False
10    1.0         True   True
11    3.0         True   True
12    3.0         True   True
13    5.0         True   True
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252