Numpy variable slice size (possibly zero)

Question

Let's say I've got some time series data:

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
x = np.linspace(0, 10, num=100)
time_series = np.sin(x) + np.random.random(100)
plt.plot(x, time_series)

If I want to 'delay' the time series by some amount, I can do this:

delay = 10
x_delayed = x[delay:]
time_series_delayed = time_series[:-delay]

plt.plot(x, time_series, label='original')
plt.plot(x_delayed, time_series_delayed, label='delayed')
plt.legend()

This is all well and good, but I want to keep the code clean while still allowing delay to be zero. As it stands, I get an error because -0 is just 0, so the slice my_arr[:-0] evaluates to my_arr[:0], which is always the empty slice rather than the full array.

>>> time_series[:-0]
array([], dtype=float64)
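The root cause is that integer zero has no sign, so a slice cannot tell "stop zero elements before the end" apart from "stop at index zero". A minimal check, using a small array `arr` made up for illustration:

```python
import numpy as np

arr = np.arange(5)

# Integer zero has no sign, so -0 is the same value as 0.
assert -0 == 0

# Hence arr[:-0] is arr[:0]: an empty slice, not the full array.
empty = arr[:-0]
print(empty.size)  # 0

# Any non-zero delay trims from the end as intended.
trimmed = arr[:-2]
print(trimmed.tolist())  # [0, 1, 2]
```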

This means that if I want to encode the idea that a delay of zero is identical to the original array, I have to special-case it every single time I use the slice. This is tedious and error-prone:

# Make 3 plots, for negative, zero, and positive delays
for delay in (0, 5, -5):

    if delay > 0:
        x_delayed = x[delay:]
        time_series_delayed = time_series[:-delay]

    elif delay < 0:
        # Negative delay is the complement of positive delay
        x_delayed = x[:delay]
        time_series_delayed = time_series[-delay:]

    else:
        # Zero delay just copies the array
        x_delayed = x[:]
        time_series_delayed = time_series[:]
    # Add the delayed time series to the plot
    plt.plot(
        x_delayed, 
        time_series_delayed, 
        label=f'delay={delay}',
        # change the alpha to make things less cluttered
        alpha=1 if delay == 0 else 0.3
    )
plt.legend()

I've had a look at the numpy slicing object and np.s_, but I can't seem to figure it out.

Is there a neat/pythonic way of encoding the idea that a delay of zero is the original array?

`my_arr[:-delay or len(my_arr)]` works but idk how neat it is! — slothrop, May 07 '23 at 11:38
+1 for the hack! that's pretty neat, but not super obvious. (and I assume you meant `my_arr[:-delay or len(my_arr)]`?) Could you post it as an answer and I'll select it unless something more explicit comes around? — beyarkay, May 07 '23 at 11:41

score 1 · Answer 1 · answered May 07 '23 at 11:49


I don't know if this is as neat as one might like, but you can make use of the way Python treats truthiness and falsiness: i or x evaluates to x if i is 0, and to i if i is any other integer.

So you could replace the various branches of your conditional with just:

time_series_delayed = time_series[:-delay or len(time_series)]

When delay is 0, this evaluates to time_series[:len(time_series)] which is the same as time_series itself.

As a quick demonstration:

time_series = list(range(10))

def f(i):
    return time_series[:-i or len(time_series)]

print(time_series)
for n in (2, 1, 0):
    print(f"{n}: {f(n)}")

prints:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2: [0, 1, 2, 3, 4, 5, 6, 7]
1: [0, 1, 2, 3, 4, 5, 6, 7, 8]
0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
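The same expression should carry over to NumPy arrays, since it only changes the stop index handed to the slice. A small sketch, where the helper name `trim_end` is hypothetical and only non-negative delays are considered:

```python
import numpy as np

def trim_end(arr, delay):
    """Drop `delay` elements from the end; delay == 0 keeps everything.

    Hypothetical helper for illustration; only meaningful for delay >= 0.
    """
    # -0 is falsy, so `or` falls through to len(arr) when delay is 0.
    return arr[:-delay or len(arr)]

data = np.arange(10)
print(trim_end(data, 0).shape)  # (10,)
print(trim_end(data, 3).shape)  # (7,)
```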


slothrop


Thanks! As mentioned in the comment, I'll wait a bit to see if there's a fancy numpy trick to get around this (surely I can't be the only one with this problem?) but will accept your answer if nothing better seems to come up. – beyarkay May 07 '23 at 11:57
Ahh sorry, I just realised that this doesn't work for negative delays, so I can't award this as the answer. A delay of -2 on the list `[0,1,2,3,4,5,6]` should result in `[0,1,2,3,4]` and a delay of +2 should result in `[2,3,4,5,6]` – beyarkay May 07 '23 at 17:10

Ah I see - I can't think of a way of achieving that without an `if` statement and separate behaviour based on the sign. There are some recipes here that work with both signs - https://stackoverflow.com/questions/30399534/shift-elements-in-a-numpy-array - but (a) generally they have an `if` statement under the hood, (b) rather than reducing the length of the array, they leave NaN behind. (Easy enough to remove those NaNs of course though.) – slothrop May 07 '23 at 17:14
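For reference, a NaN-padding shift of the kind mentioned in the comment above might look roughly like this. It is only a sketch with a made-up name, `shift_nan`, and is not code taken from the linked question:

```python
import numpy as np

def shift_nan(arr, num):
    """Shift `arr` by `num` places, padding the vacated slots with NaN.

    Hypothetical helper; note that it still branches on the sign of `num`.
    """
    result = np.full(len(arr), np.nan)
    if num > 0:
        result[num:] = arr[:-num]
    elif num < 0:
        result[:num] = arr[-num:]
    else:
        result[:] = arr
    return result

shifted = shift_nan(np.arange(5.0), 2)

# Dropping the NaNs afterwards recovers a shortened array.
kept = shifted[~np.isnan(shifted)]
print(kept.tolist())  # [0.0, 1.0, 2.0]
```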

beyarkay · Accepted Answer · 2023-05-15T12:07:55.960

The solution I went with uses the fact that my_arr[2:] is equivalent to my_arr[2:None]:

arr[(d if d > 0 else None):(d if d < 0 else None)]

A bit more readable:

arr = [0, 1, 2, 3, 4, 5]
delay = 3

start_delay = delay if delay > 0 else None
finish_delay = delay if delay < 0 else None

delayed_arr = arr[start_delay:finish_delay]

Wrapped up in a nice function and with some assertions to show it works:

def delay_array(array, delay):
    """Delays the values in `array` by the amount `delay`.

    Regular slicing struggles with this since negative slicing (which goes from
    the end of the array) and positive slicing (going from the front of the
    array) meet at zero and don't play nicely.

    We use the fact that Python's slicing syntax treats `None` as though it
    didn't exist, so `arr[2:]` is equivalent to `arr[2:None]`.

    This can be used on numpy arrays, but also works on native python lists.
    """
    start_index = delay if delay > 0 else None
    finish_index = delay if delay < 0 else None
    return array[start_index:finish_index]

arr = [0, 1, 2, 3, 4, 5]
# Zero delay results in the same array
assert delay_array(arr,  0) == [0, 1, 2, 3, 4, 5]

# Delay greater/less than zero removes `delay` elements from the front/back
# of the array
assert delay_array(arr, +3) == [         3, 4, 5]
assert delay_array(arr, -3) == [0, 1, 2,        ]

# A delay longer than the array results in an empty array
assert delay_array(arr, +6) == []
assert delay_array(arr, -6) == []
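If the same delay has to be applied to several arrays, one option is to build the slice once with Python's built-in `slice` and reuse it. This is only a sketch under that assumption; `delay_slice` is a hypothetical name, not something provided by NumPy:

```python
import numpy as np

def delay_slice(delay):
    """Return a slice that drops |delay| elements from the front
    (delay > 0) or from the back (delay < 0); delay == 0 keeps all."""
    start = delay if delay > 0 else None
    stop = delay if delay < 0 else None
    return slice(start, stop)

arr = np.arange(6)
print(arr[delay_slice(0)].tolist())   # [0, 1, 2, 3, 4, 5]
print(arr[delay_slice(2)].tolist())   # [2, 3, 4, 5]
print(arr[delay_slice(-2)].tolist())  # [0, 1, 2, 3]
```

Because a `slice` object is just a value, it can be stored and applied to lists as well as arrays.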

And to cap it all off:

import numpy as np
import matplotlib.pyplot as plt

def delay_array(array, delay):
    start_index = delay if delay > 0 else None
    finish_index = delay if delay < 0 else None
    return array[start_index:finish_index]

np.random.seed(42)
x = np.linspace(0, 10, num=100)
time_series = np.sin(x) + np.random.random(100)

for delay in (0, 5, -5):
    x_delayed = delay_array(x, delay)
    time_series_delayed = delay_array(time_series, -delay)
    plt.plot(
        x_delayed, 
        time_series_delayed, 
        label=f'delay={delay}',
        alpha=1 if delay == 0 else 0.3
    )
plt.legend()
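As a sanity check on the plotting loop, the two delayed arrays should always end up the same length, otherwise `plt.plot` would complain about mismatched shapes. A quick sketch, repeating `delay_array` so that it runs on its own (the `lengths` dict is only there for the check):

```python
import numpy as np

def delay_array(array, delay):
    start_index = delay if delay > 0 else None
    finish_index = delay if delay < 0 else None
    return array[start_index:finish_index]

x = np.linspace(0, 10, num=100)
time_series = np.sin(x)

lengths = {}
for delay in (0, 5, -5):
    x_delayed = delay_array(x, delay)
    ts_delayed = delay_array(time_series, -delay)
    # Both slices drop |delay| elements, from opposite ends.
    assert len(x_delayed) == len(ts_delayed) == 100 - abs(delay)
    lengths[delay] = len(x_delayed)

print(lengths)  # {0: 100, 5: 95, -5: 95}
```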
