I know it is easy to check how many missing values are in a pandas Series. What if I want to check whether a Series contains 6 or more consecutive missing values?
2 Answers
1
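# temp_df is a DataFrame and i is the label of the column to inspect
mask = temp_df.loc[:, i].isna()
# (~mask).cumsum() is constant within a run of NaNs, so it labels each run
max_missing_val = temp_df.loc[:, i][mask].groupby((~mask).cumsum()[mask]).agg(['size'])
if len(max_missing_val) == 0:
    # no missing values at all
    max_missing_val = 0
else:
    # length of the longest run of consecutive NaNs
    max_missing_val = max_missing_val['size'].max()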
Reference: Counting continuous nan values in panda Time series
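To turn this into a direct yes/no check for 6+ consecutive missing values, the same idea could be wrapped in a small function. This is only a sketch: `longest_nan_run` is a hypothetical helper name (not a pandas function), and the example Series is made up for illustration.

```python
import numpy as np
import pandas as pd


def longest_nan_run(s: pd.Series) -> int:
    """Length of the longest run of consecutive NaN values (0 if none).

    Hypothetical helper, not part of pandas.
    """
    mask = s.isna()
    # (~mask).cumsum() stays constant across a NaN run, so it labels each run
    run_lengths = mask[mask].groupby((~mask).cumsum()[mask]).size()
    return int(run_lengths.max()) if len(run_lengths) else 0


# made-up example: one run of 2 NaNs and one run of 6 NaNs
s = pd.Series([1, np.nan, np.nan, 2] + [np.nan] * 6 + [3])
print(longest_nan_run(s))       # 6
print(longest_nan_run(s) >= 6)  # True
```

Comparing the result against 6 (or any other threshold) then answers the original question.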

Zhang Yongheng
0
You can make use of cumsum to create groups of consecutive NaN values:
import numpy as np
import pandas as pd

s = pd.Series(
    [np.nan, 1, 2, np.nan, np.nan, np.nan, 3, 4, np.nan, np.nan] * 2
)
# create groups of consecutive NaN / non-NaN values
group = s.isna().ne(s.shift().isna()).cumsum()
# set threshold for minimum group size, here 3 instead of 6
threshold = 3
group_size = s.groupby(group).transform('size')
# NaN runs always get even group ids with this construction,
# so this selects the rows in runs of 3+ consecutive NaN values
print(s[(group % 2 == 0) & (group_size.ge(threshold))])
# output
3 NaN
4 NaN
5 NaN
8 NaN
9 NaN
10 NaN
13 NaN
14 NaN
15 NaN
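If only a True/False answer is needed rather than the matching rows, the grouping idea above could be reduced to a single boolean. A minimal sketch, assuming a hypothetical helper name `has_nan_run`; it uses `s.isna()` directly instead of the even-group-id trick, which should select the same rows.

```python
import numpy as np
import pandas as pd


def has_nan_run(s: pd.Series, threshold: int = 6) -> bool:
    """True if s contains at least `threshold` consecutive NaN values.

    Hypothetical helper, not part of pandas.
    """
    is_na = s.isna()
    # a new group starts wherever the NaN / non-NaN state changes
    group = is_na.ne(is_na.shift(fill_value=False)).cumsum()
    group_size = s.groupby(group).transform('size')
    # keep only rows that are NaN and sit in a long enough run
    return bool((is_na & group_size.ge(threshold)).any())


s = pd.Series(
    [np.nan, 1, 2, np.nan, np.nan, np.nan, 3, 4, np.nan, np.nan] * 2
)
print(has_nan_run(s, threshold=3))  # True: the longest NaN run has length 3
print(has_nan_run(s))               # False: no run of 6+ NaN values
```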

Anders Källmar