Subsampling/averaging over a numpy array

Question

I have a numpy array with floats.

What I would like (if it does not already exist) is a function that returns a new array containing the average of every x points in the given array, i.e. subsampling (the opposite of interpolation?).

E.g. sub_sample(numpy.array([1, 2, 3, 4, 5, 6]), 2) gives [1.5, 3.5, 5.5]

Leftovers can be dropped, e.g. sub_sample(numpy.array([1, 2, 3, 4, 5]), 2) gives [1.5, 3.5]

Thanks in advance.

Possible duplicate of [Averaging over every n elements of a numpy array](https://stackoverflow.com/questions/15956309/averaging-over-every-n-elements-of-a-numpy-array) — Bas Swinckels, Jun 11 '18 at 10:11

Chris · Accepted Answer · 2012-06-01T09:47:59.907

31

Using NumPy routines you could try something like

import numpy

x = numpy.array([1, 2, 3, 4, 5, 6])

numpy.mean(x.reshape(-1, 2), 1) # Prints array([ 1.5,  3.5,  5.5])

and just replace the 2 in the reshape call with the number of items you want to average over.

Edit: This assumes that n divides into the length of x. You'll need to include some checks if you are going to turn this into a general function. Perhaps something like this:

def average(arr, n):
    # Drop trailing items that do not fill a complete group of n
    end = n * (len(arr) // n)
    return numpy.mean(arr[:end].reshape(-1, n), 1)

This function in action:

>>> x = numpy.array([1, 2, 3, 4, 5, 6])
>>> average(x, 2)
array([ 1.5,  3.5,  5.5])

>>> x = numpy.array([1, 2, 3, 4, 5, 6, 7])
>>> average(x, 2)
array([ 1.5,  3.5,  5.5])
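The "checks" mentioned above could look something like the following. This is only a sketch under stated assumptions: the name average_checked and the specific validation rules (1-D input, positive integer n) are illustrative choices, not part of the answer.

```python
import numpy as np

def average_checked(arr, n):
    """Mean of each consecutive block of n items; an incomplete tail is dropped."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim != 1:
        raise ValueError("expected a 1-D array")
    if not isinstance(n, (int, np.integer)) or n < 1:
        raise ValueError("n must be a positive integer")
    blocks = arr.size // n                  # number of complete blocks
    # Keep only the items that fill complete blocks, then average each row
    return arr[:blocks * n].reshape(blocks, n).mean(axis=1)

print(average_checked([1, 2, 3, 4, 5, 6, 7], 2))
```

Passing a plain list works here because of the np.asarray call; whether to accept lists or insist on arrays is a design choice.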

edited Jun 01 '12 at 09:47

answered Jun 01 '12 at 09:40


2

This one works fine, except when the window size (2 in the example above) does not divide the length of the array, but I can make sure it does. Thanks! – Michel Keijzers Jun 01 '12 at 09:46
thanks ... yes that was exactly what I also was thinking about. – Michel Keijzers Jun 01 '12 at 09:54
3

Is there an easy way to generalize this to downsampling a single axis, in a multidimensional array? e.g. average an array of shape [8,4] down to [4,4] ? – DilithiumMatrix Jul 24 '13 at 04:23
Could you provide a solution where I could enter a floating-point downsampling rate, e.g. 2.7? – maniac Dec 02 '15 at 23:22
1

@maniac If you have a question that isn't answered here, please post a new question, rather than commenting on an existing answer. – Chris Dec 04 '15 at 11:17
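On the single-axis question raised in the comments above (shape [8,4] down to [4,4]), the same reshape idea may carry over by moving the target axis to the front first. A rough sketch, where the function name average_along_axis and its axis parameter are assumptions made for illustration:

```python
import numpy as np

def average_along_axis(arr, n, axis=0):
    """Average every n consecutive entries along one axis, dropping leftovers."""
    arr = np.moveaxis(np.asarray(arr, dtype=float), axis, 0)  # target axis first
    blocks = arr.shape[0] // n
    trimmed = arr[:blocks * n]
    # Split the leading axis into (blocks, n) and average over the n part
    out = trimmed.reshape((blocks, n) + trimmed.shape[1:]).mean(axis=1)
    return np.moveaxis(out, 0, axis)                          # restore axis order

a = np.arange(32).reshape(8, 4)
print(average_along_axis(a, 2, axis=0).shape)  # (4, 4)
```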

Maria Zverina · Answer 2 · 2012-06-01T09:40:11.270

3

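def subsample(data, sample_size):
    # Passing the same iterator sample_size times makes zip yield consecutive
    # chunks; an incomplete trailing chunk is dropped.
    samples = zip(*[iter(data)] * sample_size)   # use 3 for triplets, etc.
    return [sum(chunk) / float(len(chunk)) for chunk in samples]

l = [1, 2, 3, 4, 5, 6]

print(subsample(l, 2))
print(subsample(l, 3))
print(subsample(l, 5))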

Gives:

[1.5, 3.5, 5.5]
[2.0, 5.0]
[3.0]
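The grouping step works because zip is handed the same iterator several times, so each tuple it builds consumes consecutive items. A small sketch of that step on its own (Python 3; the variable names are illustrative):

```python
data = [1, 2, 3, 4, 5]

it = iter(data)
# zip takes one item from each argument in turn; both arguments are the same
# iterator, so every tuple holds two consecutive items. The unpaired 5 is lost.
pairs = list(zip(it, it))
print(pairs)    # [(1, 2), (3, 4)]

# [iter(data)] * 3 is a list holding the same iterator three times
triples = list(zip(*[iter(data)] * 3))
print(triples)  # [(1, 2, 3)]
```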

edited Jun 01 '12 at 09:40

answered Jun 01 '12 at 09:33


2

Thank you, I will try it. However, I hope there is a numpy function, because they tend to be around 10 times faster than most similar pure-Python functions. – Michel Keijzers Jun 01 '12 at 09:39

mchrgr2000 · Answer 3 · 2019-01-31T03:33:26.993

-1

This is also a one-line solution that works:

downsampled_a = [a[i:n+i].mean() for i in range(0,size(a),n)]

"a" is the vector with your data and "n" is your sampling step.

PS: from numpy import *
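As written, the comprehension also averages a shorter final slice when len(a) is not a multiple of n. A hedged variant that skips the incomplete tail by tightening the range bound, using an explicit numpy import and len() instead of the star import (the names with_tail and without_tail are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=float)
n = 2

# Original form: the last slice a[4:6] holds a single element, so its mean is 5.0
with_tail = [a[i:i + n].mean() for i in range(0, len(a), n)]

# Stopping at len(a) - n + 1 skips any start index that lacks a full block of n
without_tail = [a[i:i + n].mean() for i in range(0, len(a) - n + 1, n)]

print(with_tail)
print(without_tail)
```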

edited Jan 31 '19 at 03:33

answered Nov 02 '17 at 02:15


It returns `[1.5, 3.5, 5.0]` - not `[1.5, 3.5]` as desired by OP. Also use `np.size()` instead of importing all from `numpy`. – AGN Gazer Nov 02 '17 at 03:15
The above one-liner returns exactly what was asked: [1.5, 3.5, 5.5], not [1.5, 3.5, 5.0]. The leftover can of course be removed (see the examples in the original question). – mchrgr2000 Jan 31 '19 at 03:12
numpy.size() can be avoided; len() is enough... (^_^)/* – mchrgr2000 Jan 31 '19 at 03:26
