
I am trying to find the most efficient way of downsampling an arbitrarily shaped 2D numpy array into a smaller (or potentially larger) square array - I want to take the max of each sub-array and put it into the new array. Here is my code:

import numpy as np

def downSampleMax(inputArray, numFrames):

    newArray = np.zeros([numFrames, numFrames], dtype = np.uint8)
    filterSize = (int(np.ceil(float(inputArray.shape[0]) / numFrames)), 
                  int(np.ceil(float(inputArray.shape[1]) / numFrames)))
    rowArr = np.linspace(0, inputArray.shape[0] - filterSize[0], numFrames, dtype=int)
    colArr = np.linspace(0, inputArray.shape[1] - filterSize[1], numFrames, dtype=int)

    for iRow in range(numFrames):
        for iCol in range(numFrames):
            newArray[iRow, iCol] = np.max(inputArray[rowArr[iRow]: rowArr[iRow] + filterSize[0], 
                                                     colArr[iCol]: colArr[iCol] + filterSize[1]])

    return newArray 

Any ideas on how to speed this up significantly? I think from what I've read that vectorization or slicing might be the way forward but no idea how to do that.

  • Instead of `float(a) / b`, it is probably better to add a `from __future__ import division` at the top of your script, and then simply do `a/b`. – Bas Swinckels Aug 04 '15 at 20:34
  • Cool, thanks Bas - do you know if I'd still have to do that in Python 3? I'm using Python 2 for this project as I have to interface with Theano *** edit - just tried it, appears not! *** – vahndi Aug 04 '15 at 20:40
  • The `from __future__` is necessary in Python2, in Python3 this is the default, see [PEP 238](https://www.python.org/dev/peps/pep-0238/). If you are still using Python2 for whatever reason, adding that line will make it easier to port your code later. – Bas Swinckels Aug 04 '15 at 20:47
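  • The division behaviour described above is easy to check directly. A minimal sketch in Python 3, where `/` is always true division and `//` is floor division (in Python 2, `from __future__ import division` gives the same behaviour):

    ```python
    # In Python 3, / is true division and // is floor division.
    a, b = 7, 2

    print(a / b)   # → 3.5 (true division)
    print(a // b)  # → 3   (floor division)
    ```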

1 Answer


Here's a vectorized approach that does the blockwise max calculations in as NumPythonic a way as it could get -

# Get shape of input array
M,N = inArr.shape

# Reshape the 2D input array into a 4D array whose row and column axes 
# are each split into nFrames blocks. Then, use `max` along the block axes.
out = inArr.reshape(nFrames, M//nFrames, nFrames, N//nFrames).max(axis=(1,3))

Sample run -

In [107]: inArr
Out[107]: 
array([[ 37,  47,  35,  96,  84,  55,  36, 218],
       [  4, 199, 104, 246, 136, 123, 215, 237],
       [ 21,  50,  21, 151, 243, 130,  69, 166],
       [242, 233, 161,  57,  20, 122, 189, 120],
       [175,  52, 114, 202,  13, 134,  30,  29],
       [117, 128,  88, 108,  97, 249,  87,  37],
       [246, 237, 230, 166, 157, 215, 254, 219],
       [156,  41,  71,  94, 114, 174, 223,  25]], dtype=uint8)

In [108]: nFrames = 4

In [109]: downSampleMax(inArr, nFrames)
Out[109]: 
array([[199, 246, 136, 237],
       [242, 161, 243, 189],
       [175, 202, 249,  87],
       [246, 230, 215, 254]], dtype=uint8)

In [110]: M,N = inArr.shape

In [111]: inArr.reshape(nFrames, M//nFrames, nFrames, N//nFrames).max(axis=(1,3))
Out[111]: 
array([[199, 246, 136, 237],
       [242, 161, 243, 189],
       [175, 202, 249,  87],
       [246, 230, 215, 254]], dtype=uint8)
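To see why the reshape is equivalent to the blockwise loop: after reshaping to `(nFrames, M//nFrames, nFrames, N//nFrames)`, the slice `[i, :, j, :]` is exactly the `(i, j)` block of the original array, so reducing over axes 1 and 3 takes the max inside each block. A sketch (with a hypothetical random 8x8 array) checking this against an explicit loop:

```python
import numpy as np

rng = np.random.default_rng(0)
inArr = rng.integers(0, 255, size=(8, 8)).astype(np.uint8)
nFrames = 4
M, N = inArr.shape
r, c = M // nFrames, N // nFrames  # block sizes

# Reshape trick: axes 1 and 3 index positions *within* each block
fast = inArr.reshape(nFrames, r, nFrames, c).max(axis=(1, 3))

# Explicit blockwise loop for comparison
slow = np.zeros((nFrames, nFrames), dtype=np.uint8)
for i in range(nFrames):
    for j in range(nFrames):
        slow[i, j] = inArr[i*r:(i+1)*r, j*c:(j+1)*c].max()

assert np.array_equal(fast, slow)
```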

Runtime tests -

In [112]: inArr = np.random.randint(0,255,(512,512)).astype('uint8')

In [113]: nFrames = 32

In [114]: def vectorized_downSampleMax(inArr, nFrames):
     ...:   M,N = inArr.shape
     ...:   return inArr.reshape(nFrames, M//nFrames, nFrames, N//nFrames).max(axis=(1,3))
     ...: 
     ...: 
     ...: def downSampleMax(inArr, nFrames):
     ...: 
     ...:   newArray = np.zeros([nFrames, nFrames], dtype = np.uint8)
     ...:   filterSize = (int(np.ceil(float(inArr.shape[0]) / nFrames)), 
     ...:                 int(np.ceil(float(inArr.shape[1]) / nFrames)))
     ...:   rowArr = np.linspace(0, inArr.shape[0] - filterSize[0], nFrames).astype(int)
     ...:   colArr = np.linspace(0, inArr.shape[1] - filterSize[1], nFrames).astype(int)
     ...: 
     ...:   for iRow in range(nFrames):
     ...:       for iCol in range(nFrames):
     ...:           newArray[iRow, iCol] = np.max(inArr[rowArr[iRow]: rowArr[iRow] + filterSize[0], colArr[iCol]: colArr[iCol] + filterSize[1]])
     ...: 
     ...:   return newArray
     ...: 

In [115]: %timeit downSampleMax(inArr, nFrames)
10 loops, best of 3: 20.1 ms per loop

In [116]: %timeit vectorized_downSampleMax(inArr, nFrames)
1000 loops, best of 3: 909 µs per loop
  • I'm looking forward to figuring out what this exactly does:) Aah, I see now. `nFrames` is not the size for averaging, but the size of the final image. Very good! – Andras Deak -- Слава Україні Oct 04 '15 at 09:38
  • @AndrasDeak Ah you following numpy tags!? Well, it's just a reshaping trick, similar to http://stackoverflow.com/questions/26280418/sum-over-blocks-in-a-2d-matrix-matlab/26281056#26281056 and http://stackoverflow.com/questions/26871083/how-can-i-vectorize-the-averaging-of-2x2-sub-arrays-of-numpy-array – Divakar Oct 04 '15 at 09:40
  • 1
    Thanks a lot for the links! Yeah, I'm sort-of learning python/numpy, and I happened to notice a post answered by @Divakar. Learning ensues;) – Andras Deak -- Слава Україні Oct 04 '15 at 09:50
  • @AndrasDeak Very kool! Transitioning to Numpy from MATLAB is fairly easy, so go you! :) – Divakar Oct 04 '15 at 09:51
  • I think this only works when each dimension of the input array is exactly divisible by nFrames, but this is a big speed improvement so I can do some array padding or cropping before the reshaping. Thanks @Divakar! – vahndi Oct 05 '15 at 10:35
  • @cohdez Yes, for such corner cases, `padding with zeros` would be the preferred approach I think. – Divakar Oct 05 '15 at 10:48
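  • Following up on the last two comments, one way to handle inputs whose dimensions are not exact multiples of `nFrames` is to pad up to the next multiple before reshaping. A hedged sketch (the function name `downSampleMaxPadded` is mine, not from the answer); since we take a max, zero-padding never changes a block's result for unsigned data:

    ```python
    import numpy as np

    def downSampleMaxPadded(inArr, nFrames):
        # Pad each axis up to the next multiple of nFrames, then apply
        # the reshape-and-max trick from the answer above.
        M, N = inArr.shape
        padM = (-M) % nFrames   # rows to add (0 if already divisible)
        padN = (-N) % nFrames   # cols to add
        padded = np.pad(inArr, ((0, padM), (0, padN)), mode='constant')
        Mp, Np = padded.shape
        return padded.reshape(nFrames, Mp//nFrames, nFrames, Np//nFrames).max(axis=(1, 3))
    ```

    For signed data, `mode='edge'` (repeating border values) would avoid the padded zeros winning the max in all-negative blocks.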