I have a rainfall time series like:

rainfall
0   3.1
1   2
2   0
3   0
4   12
5   0
6   1
7   2
8   3
9   6
10  1
11  2
12  9

I want to use Python pandas to flag an observation when it and its previous readings form a window of 4 strictly increasing values, i.e. within the window every reading is greater than the one before it (x[i+1] > x[i]).

The expected output would be something like this:

rainfall    Flag test
0   3.1 F
1   2   F
2   0   F
3   0   F
4   12  F
5   0   F
6   1   F
7   2   F
8   3   T
9   6   T
10  1   F
11  2   F
12  9   F

It returns T only for rows where the reading and the 3 readings before it all satisfy this condition (rows 8 and 9 above).

I was wondering if somebody could help me.

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252

1 Answer

Use strides to build rolling windows of length N, take the differences within each window with numpy.diff, compare them with 0, and finally check that all values per row are True with numpy.all:

import numpy as np

N = 4
# Pad the front with N-1 NaNs so every row of df gets a full window
x = np.concatenate([[np.nan] * (N-1), df['rainfall'].values])

def rolling_window(a, window):
    # View of `a` as overlapping windows of length `window` (no copy)
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

arr = rolling_window(x, N)
print(arr)
[[ nan  nan  nan  3.1]
 [ nan  nan  3.1  2. ]
 [ nan  3.1  2.   0. ]
 [ 3.1  2.   0.   0. ]
 [ 2.   0.   0.  12. ]
 [ 0.   0.  12.   0. ]
 [ 0.  12.   0.   1. ]
 [12.   0.   1.   2. ]
 [ 0.   1.   2.   3. ]
 [ 1.   2.   3.   6. ]
 [ 2.   3.   6.   1. ]
 [ 3.   6.   1.   2. ]
 [ 6.   1.   2.   9. ]]
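
On NumPy 1.20 or newer, the same windows can likely be built with sliding_window_view, which avoids computing strides by hand and returns a read-only view. This is a minimal sketch rather than part of the original answer; the DataFrame is rebuilt inline from the question's data so the snippet runs on its own:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Sample data copied from the question
df = pd.DataFrame({'rainfall': [3.1, 2, 0, 0, 12, 0, 1, 2, 3, 6, 1, 2, 9]})

N = 4
# Pad with N-1 NaNs so the number of windows equals len(df)
x = np.concatenate([np.full(N - 1, np.nan), df['rainfall'].to_numpy()])

# Read-only view of shape (len(df), N); no manual stride arithmetic
arr = sliding_window_view(x, N)

# A window is flagged when every consecutive difference is positive;
# comparisons with NaN are False, so the padded rows are never flagged
flag = (np.diff(arr, axis=1) > 0).all(axis=1)
```

Because the view is read-only, accidental writes into the overlapping windows raise an error instead of silently corrupting the data, which is the main risk with as_strided.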

# Comparisons with NaN are False, so the first N-1 (padded) rows are never flagged
df['flag'] = (np.diff(arr, axis=1) > 0).all(axis=1)
print(df)
    rainfall   flag
0        3.1  False
1        2.0  False
2        0.0  False
3        0.0  False
4       12.0  False
5        0.0  False
6        1.0  False
7        2.0  False
8        3.0   True
9        6.0   True
10       1.0  False
11       2.0  False
12       9.0  False
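
If a pandas-only version is preferred, one possible alternative (a sketch, not the method above) is to count consecutive rises: N readings in increasing order are equivalent to N-1 positive differences in a row. The name rising is illustrative, and the data is again copied from the question:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'rainfall': [3.1, 2, 0, 0, 12, 0, 1, 2, 3, 6, 1, 2, 9]})

N = 4
# 1 where a reading is strictly greater than the previous one, else 0
rising = df['rainfall'].diff().gt(0).astype(int)

# N increasing readings means N-1 consecutive rises ending at this row;
# the first rows have an incomplete window (NaN sum), which compares as False
df['flag'] = rising.rolling(N - 1).sum().eq(N - 1)
```

On this data it flags rows 8 and 9, matching the strided approach, and changing N adjusts the window length in one place.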
jezrael