How do I quickly decimate a numpy array?

Question

I need a function that decimates a numpy array, i.e. removes m in every n elements. For example, to remove 1 in 2 or 2 in 3. So an array which is: [7, 4, 3, 5, 9, 2, 4, 1, 6, 8]

decimated by 1:2 would become: [7, 3, 9, 4, 6]

I wonder if it is possible to reshape the array from a 1D array of length N to a 2D array of shape (N/2, 2), then drop the extra dimension?

Ideally, rather than just dump the decimated samples, I would like to find the maximum value across each set (in this example pair) of values. For example: [7, 5, 9, 4, 8]

Is there a way to find the maximum value across each set rather than just to drop it?

The added challenge is that the point here is to plot the values.

The decimation is required because plotting every value takes too long, so I have to reduce the size of the array before plotting it, and I need to do this quickly. For or while loops would therefore be too slow.

To answer part of your question you can subsample every other variable by indexing using [::2] — BenT, Jun 13 '19 at 17:33
For quick plotting see https://stackoverflow.com/questions/54449631/improve-min-max-downsampling/54470935#54470935 — user2699, Jun 13 '19 at 17:46
BenT: so that would do the decimation thing. Thanks. Very easy at that level. — Richard, Jun 13 '19 at 17:58

score 5 · Answer 1 · answered Aug 26 '22 at 07:05

Be wary of simply discarding samples: significant readings can be thrown away with them.

For the task you describe, proper decimation (low-pass filtering followed by downsampling) is worth using.

Unfortunately NumPy does not provide it, but SciPy does, as scipy.signal.decimate.

The code below shows a case where discarding samples gives a misleading result.

As you can see, the original data (blue) has a peak. Manual thinning can skip it entirely (green). If you apply decimation from the library, the peak is still reflected in the result (orange).

import matplotlib.pyplot as plt
import numpy as np
from scipy import signal

downsampling_factor = 2

# 50 random samples in [0, 10) with a single peak of 50 in the middle
t = np.linspace(0, 1, 50)
half = len(t) // 2
y = np.concatenate([np.random.randint(0, 10, half), [50], np.random.randint(0, 10, half - 1)])

# Library decimation: anti-aliasing filter, then downsampling
ydem = signal.decimate(y, downsampling_factor)
t_new = np.linspace(0, 1, len(ydem))

# Manual thinning: keep every downsampling_factor-th sample, no filtering
manual_decimation = y[::downsampling_factor]
t_manual_decimation = np.linspace(0, 1, len(manual_decimation))

plt.plot(t, y, '.-', t_new, ydem, 'o-', t_manual_decimation, manual_decimation, 'x-')
plt.legend(['data', 'scipy decimate', 'manual decimate'], loc='best')
plt.show()

In general this is not a trivial task, so please be careful.

UPD: note that the length of the vector must be greater than 27.
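
The risk described above can also be checked numerically, without a plot. The following is only a minimal sketch and not the scipy-based method of this answer: it uses plain NumPy, a fixed seed, and a hypothetical helper `minmax_envelope` (the name and signature are assumptions made for illustration) to show that plain striding can miss a one-sample spike while a per-block min/max keeps it.

```python
import numpy as np


def minmax_envelope(y, k):
    """Return (minima, maxima) of consecutive blocks of k samples.

    Hypothetical helper for illustration only. Trailing samples that do
    not fill a whole block are dropped.
    """
    y = np.asarray(y)
    n_blocks = len(y) // k
    blocks = y[:n_blocks * k].reshape(n_blocks, k)
    return blocks.min(axis=1), blocks.max(axis=1)


rng = np.random.default_rng(0)
y = rng.integers(0, 10, 50)   # background values in [0, 10)
y[25] = 50                    # one-sample spike at an odd index

strided = y[::2]              # plain thinning: keeps even indices only
lo, hi = minmax_envelope(y, 2)

print(strided.max() < 50)     # True: the spike was skipped
print(hi.max())               # 50: the spike survives in the block maxima
```

For plotting, drawing both `lo` and `hi` against the block positions gives an envelope of the data at half the number of points per curve.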

score 3 · Answer 2 · answered Jun 13 '19 at 20:05

To find the maximum of each group of k values:

1) k divides N:

import numpy as np

k,N = 3,18
a = np.random.randint(0,10,N)
a
# array([0, 6, 6, 3, 7, 0, 9, 2, 3, 2, 5, 4, 2, 6, 9, 6, 3, 2])
a.reshape(-1,k).max(1)
# array([6, 7, 9, 5, 9, 6])

2) k does not divide N:

k,N = 4,21
a = np.random.randint(0,10,N)
a
# array([4, 4, 6, 0, 0, 1, 7, 8, 2, 3, 0, 5, 7, 1, 1, 5, 7, 8, 3, 1, 7])
np.maximum.reduceat(a, np.arange(0,N,k))
# array([6, 8, 5, 7, 8, 7])

2) should always work but I suspect 1) is faster where applicable
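
The two cases above can be wrapped in one function. This is a sketch under stated assumptions: `block_max` is a hypothetical name, and the input is assumed to be a non-empty 1-D array. It takes the reshape path when k divides the length and falls back to `np.maximum.reduceat` otherwise.

```python
import numpy as np


def block_max(a, k):
    """Maximum of each consecutive group of k samples.

    Hypothetical helper; assumes a non-empty 1-D input. When k does not
    divide len(a), the last group is shorter than k.
    """
    a = np.asarray(a)
    n = len(a)
    if n % k == 0:
        # Case 1: view the data as rows of k values and reduce each row
        return a.reshape(-1, k).max(axis=1)
    # Case 2: reduce between the start indices 0, k, 2k, ...
    return np.maximum.reduceat(a, np.arange(0, n, k))


a = np.array([7, 4, 3, 5, 9, 2, 4, 1, 6, 8])  # array from the question
print(block_max(a, 2))  # [7 5 9 4 8]
print(block_max(a, 3))  # [7 9 6 8]
```

The first call reproduces the result asked for in the question; the second shows the short final group `[8]` being reduced on its own.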

Paul Panzer


As far as I can work out Paul, this technique (1) does not work. The array remains 18 elements long. Tried it a couple of ways and with some variations and no decimation takes place. – Richard Jun 15 '19 at 04:18
The method (2) seems to work when N does divide by k, so one can use (2) instead of (1). – Richard Jun 15 '19 at 04:22
Thank you very much for the suggestions by-the-way Paul. Should have started with that! – Richard Jun 15 '19 at 04:25
@Richard you are welcome. Weird, though that 1) doesn't work for you. If you verbatim copy 1) does it not print a six element array in the end? – Paul Panzer Jun 15 '19 at 06:38
I did, and no it didn't. It returns the original array. I am using Python 3. Does that make a difference? – Richard Jun 15 '19 at 08:03
@Richard Nope. I'm sorry, but that's pretty much impossible, you must be doing something wrong. `a.reshape(-1,3)` yields a 6x3 array and `.max(1)` reduces that along the second axis, producing a 6 element vector. Just to be sure, you do look at the result of that expression, not at `a`? – Paul Panzer Jun 15 '19 at 09:01
Thanks hugely Paul. I am struggling to see what I am doing wrong. I have both re-written it and copied it verbatim. In both cases I get the original back - which is weird as one might expect it to do something - even if it were the wrong thing. Will have another go tonight and see what I get. – Richard Jun 15 '19 at 09:17
Yes indeed Paul, that was it. Got it working now. Much obliged to you, these are simple but highly effective decimation routines for high speed graph drawing. – Richard Jun 16 '19 at 03:46
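
For anyone hitting the same confusion as in the comments above: `reshape` and `max` return new arrays and leave `a` untouched, so the result has to be assigned (or inspected directly). A short sketch using the array from case 1; the name `b` is only illustrative.

```python
import numpy as np

a = np.array([0, 6, 6, 3, 7, 0, 9, 2, 3, 2, 5, 4, 2, 6, 9, 6, 3, 2])

a.reshape(-1, 3).max(1)      # result is computed but discarded
print(a.shape)               # (18,)  a itself is unchanged

b = a.reshape(-1, 3).max(1)  # keep the result
print(b.shape)               # (6,)
print(b)                     # [6 7 9 5 9 6]
```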

Gerard Kruisheer · Answer 3 · 2021-04-12T10:42:06.050

A quick and dirty way is

k,N = 3,18
a = np.random.randint(0,10,N) #[9, 6, 6, 6, 8, 4, 1, 4, 8, 1, 2, 6, 1, 8, 9, 8, 2, 8]
a = a[::k] #[9, 6, 1, 1, 1, 8]

This should work regardless of k dividing into N or not.
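
As a small illustration on the array from the question (a sketch; the variable names are made up for this example): strided slicing returns a view, so no data is copied, and a negative stop such as `a[:-k:k]` leaves out the final group.

```python
import numpy as np

a = np.array([7, 4, 3, 5, 9, 2, 4, 1, 6, 8])

keep_1_in_2 = a[::2]   # remove 1 in 2
keep_1_in_3 = a[::3]   # remove 2 in 3

print(keep_1_in_2)                       # [7 3 9 4 6]
print(keep_1_in_3)                       # [7 5 4 8]
print(np.shares_memory(a, keep_1_in_2))  # True: a view, not a copy
print(a[:-2:2])                          # [7 3 9 4]  last kept sample is lost
```

Because the result is a view, call `.copy()` on it if the original array will be modified later.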

edited Apr 12 '21 at 10:42

answered Jan 21 '21 at 09:04
