flagging observations if (i+1) is larger than (i) for all i in a window of 4 (previous readings)

Question

I have rainfall time series like:

I want to use Python pandas to flag an observation when its previous 4 readings are strictly increasing, i.e. each reading in the window is larger than the one before it (reading i+1 > reading i).

The expected output would be something like this:

rainfall    Flag test
0   3.1 F
1   2   F
2   0   F
3   0   F
4   12  F
5   0   F
6   1   F
7   2   F
8   3   T
9   6   T
10  1   F
11  2   F
12  9   F

Here T is returned only for the 9th row, where the previous 3 readings met this condition.

I was wondering if somebody could help me.

rainfall Flag test 0 3.1 F 1 2 F 2 0 F 3 0 F 4 12 F 5 0 F 6 1 F 7 2 F 8 3 F 9 6 T 10 1 F 11 2 F 12 9 F — , Jul 13 '18 at 04:56
Thank you, there is only one True, can you explain why and how you get it? Not sure if I understand `previous 4 readings` — jezrael, Jul 13 '18 at 05:01
This is just a sample that I have added. The reason that only one is True is that it is the only observation where the previous 3 (6th, 7th, and 8th) met the condition of (i+1 > i) — , Jul 13 '18 at 05:06
And `condition of (i+1 > i)` means for `(6th, 7th, and 8th)` `2>1, 3>2, 6>3`? Or something else? — jezrael, Jul 13 '18 at 05:20
yes exactly, it means the observation kept increasing for more than 4 observations. — , Jul 13 '18 at 05:21

score 0 · Accepted Answer · answered Jul 13 '18 at 05:26

Use strides to build rolling windows of length N, take the differences within each window with numpy.diff, compare them to 0, and finally check that all values in each row are True with numpy.all:

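import numpy as np

N = 4
# pad the front with NaN so that the first N-1 rows also get a (incomplete) window
x = np.concatenate([[np.nan] * (N-1), df['rainfall'].values])

def rolling_window(a, window):
    # build a 2D strided view in which row k is a[k:k+window]
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

arr = rolling_window(x, N)
print (arr)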
[[ nan  nan  nan  3.1]
 [ nan  nan  3.1  2. ]
 [ nan  3.1  2.   0. ]
 [ 3.1  2.   0.   0. ]
 [ 2.   0.   0.  12. ]
 [ 0.   0.  12.   0. ]
 [ 0.  12.   0.   1. ]
 [12.   0.   1.   2. ]
 [ 0.   1.   2.   3. ]
 [ 1.   2.   3.   6. ]
 [ 2.   3.   6.   1. ]
 [ 3.   6.   1.   2. ]
 [ 6.   1.   2.   9. ]]
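As a side note, the same window array can probably be built without a hand-written stride helper. The sketch below is not part of the original answer; it assumes NumPy 1.20 or newer, where `sliding_window_view` is available, and it hard-codes the sample rainfall values from the question instead of reading them from `df`. The names `rainfall`, `padded` and `windows` are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

N = 4
# sample values copied from the question (assumption: used in place of df['rainfall'])
rainfall = np.array([3.1, 2, 0, 0, 12, 0, 1, 2, 3, 6, 1, 2, 9])

# same NaN padding as above, so every observation gets a window of length N
padded = np.concatenate([np.full(N - 1, np.nan), rainfall])

# row k of the result is padded[k:k+N]; the view is read-only by default
windows = sliding_window_view(padded, N)
print(windows.shape)
```

The result has one row per observation, so it can be passed to `np.diff(..., axis=1)` in the same way as `arr`.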

df['flag'] = (np.diff(arr, axis=1) > 0).all(axis=1)
print (df)
    rainfall   flag
0        3.1  False
1        2.0  False
2        0.0  False
3        0.0  False
4       12.0  False
5        0.0  False
6        1.0  False
7        2.0  False
8        3.0   True
9        6.0   True
10       1.0  False
11       2.0  False
12       9.0  False
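For comparison, here is a minimal pandas-only sketch of the same idea. It is a different technique from the strided approach above, not the answer's method: it marks each reading that is larger than its predecessor and then counts those marks over a rolling window. It assumes the sample data from the question; the helper name `increased` is made up for illustration.

```python
import pandas as pd

N = 4
# sample values copied from the question
df = pd.DataFrame({'rainfall': [3.1, 2, 0, 0, 12, 0, 1, 2, 3, 6, 1, 2, 9]})

# True where a reading is strictly larger than the previous one
increased = df['rainfall'].diff().gt(0)

# a window of N readings contains N-1 consecutive differences;
# flag a row when all N-1 of them are increases
df['flag'] = increased.astype(int).rolling(N - 1).sum().eq(N - 1)
print(df)
```

On the sample data this flags rows 8 and 9, the same rows as the strided version. The first rows have incomplete windows, so their rolling sum is NaN and the comparison yields False.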
