
I want to find up/down patterns in a time series. This is what I use for simple up/down:

diff = np.diff(source, n=1)
encoding = np.where(diff > 0, 1, 0)

Is there a way to do this with NumPy for patterns of a given lookback length, without a slow Python loop? For example: up/up/up = 0, down/down/down = 1, up/down/up = 2, up/down/down = 3, and so on.

Thank you for your help.

myfire

1 Answer


I learned yesterday about np.lib.stride_tricks.as_strided from a Stack Overflow answer to a similar question. It is a neat trick and not as hard to understand as I expected. With it, we can define a function called rolling that lists every window of the array to check the pattern against:

import numpy as np

def rolling(a, window):
    # View of every length-`window` slice of the 1-D array `a` (no data is copied)
    shape = (a.size - window + 1, window)
    strides = (a.strides[0], a.strides[0])  # strides are measured in bytes
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

compare_with = [True, False, True]
bool_arr = np.random.choice([True, False], size=15)
patterns = rolling(bool_arr, len(compare_with))

After that, you can calculate the indexes of the pattern matches as discussed here:

# A row matches only if all of its elements equal the pattern
idx = np.where(np.all(patterns == compare_with, axis=1))

Sample run:

bool_arr
array([ True, False,  True, False,  True,  True, False, False, False,
       False, False, False,  True,  True, False])
patterns
array([[ True, False,  True],
       [False,  True, False],
       [ True, False,  True],
       [False,  True,  True],
       [ True,  True, False],
       [ True, False, False],
       [False, False, False],
       [False, False, False],
       [False, False, False],
       [False, False, False],
       [False, False,  True],
       [False,  True,  True],
       [ True,  True, False]])
idx
(array([0, 2], dtype=int64),)
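
The matching above finds one pattern at a time. To get a code for every window at once, as the question asks, one option is to read each window as a binary number. The following is only a sketch under assumptions: the function name `encode_patterns` and the sample data are made up for illustration, it relies on `sliding_window_view` (available in NumPy 1.20 and later) instead of `as_strided`, and the numbering it produces (up = 1, down or flat = 0, most significant bit first) differs from the numbering in the question's example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # needs NumPy >= 1.20


def encode_patterns(source, lookback):
    """Return one integer code per window of `lookback` consecutive moves."""
    # 1 = up, 0 = down or flat (same convention as the question's np.where)
    moves = (np.diff(source) > 0).astype(np.int64)
    # Read-only view with shape (len(moves) - lookback + 1, lookback)
    windows = sliding_window_view(moves, lookback)
    # Treat each window as a binary number, most significant bit first
    weights = 1 << np.arange(lookback - 1, -1, -1)
    return windows @ weights


# Hypothetical sample series, for illustration only
source = np.array([1.0, 2.0, 3.0, 2.5, 2.0, 2.2, 2.4, 2.6])
codes = encode_patterns(source, 3)
print(codes)  # [6 4 1 3 7]
```

With a lookback of 3, up/up/up becomes 7 and down/down/down becomes 0 under this scheme. If a different numbering is needed, a lookup array of length `2 ** lookback` indexed by `codes` can remap the values.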
mathfux
  • Thank you very much. To be honest, strides are hard for me to understand, but it's working. – myfire Sep 02 '20 at 18:05
  • Now that I have learned it, this is how I understand strides. as_strided takes `shape` and `strides` parameters. `shape` is straightforward: as you can see, it is `(13, 3)` for the new array. `strides` consists of two numbers, `strides[0]` and `strides[1]`. The first one says how far we move in `bool_arr` to reach the value one row further along in the new array. The second one does the same for columns. In both cases the step is 1, because strides are counted in bytes and each bool takes one byte. – mathfux Sep 02 '20 at 18:23
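
A small check can make the stride values in the comment above concrete. This is an illustrative sketch with made-up arrays, not part of the original answer: it shows that strides are byte counts that depend on the dtype, which is why the step is taken from the array itself here rather than hard-coded.

```python
import numpy as np

bool_arr = np.array([True, False, True, False, True])
print(bool_arr.strides)   # (1,): each bool element occupies one byte

float_arr = np.arange(5, dtype=np.float64)
print(float_arr.strides)  # (8,): each float64 element occupies eight bytes

window = 3
step = float_arr.strides[0]  # bytes between neighbouring elements
view = np.lib.stride_tricks.as_strided(
    float_arr,
    shape=(float_arr.size - window + 1, window),
    strides=(step, step),  # next row and next column both advance one element
    writeable=False,       # rows overlap in memory, so keep the view read-only
)
print(view)
```

Each row of `view` starts one element later than the previous one, so the rows are `[0, 1, 2]`, `[1, 2, 3]` and `[2, 3, 4]`.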