
I have a 2D array that I would like to down sample to compare it to another.

Let's say my array x is 512x512, and I'd like an array y that is 128x128, where the elements of y are built using an interpolation of the values over 4x4 blocks of x (this interpolation could just be taking the average, but other methods, like the geometric average, could be interesting).

So far I have looked at scipy.ndimage.interpolation.zoom, but I don't get the results I want:

>>> import numpy as np
>>> import scipy.ndimage
>>> x = np.arange(16).reshape(4, 4)
>>> print(x)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
>>> y = scipy.ndimage.interpolation.zoom(x, 0.5)
>>> print(y)
[[ 0  3]
 [12 15]]

I expected y to be

[[ 2.5  4.5]
 [10.5 12.5]]

Note that simply setting dtype=np.float32 doesn't solve this.

Amxx

2 Answers


sklearn.feature_extraction.image.extract_patches cleverly uses np.lib.stride_tricks.as_strided to produce a windowed array that can be operated on.

The sliding_window function, found in Efficient Overlapping Windows with Numpy, also produces a windowed array, with or without overlap, and lets you get a glimpse of what is happening under the hood.
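Since the linked post's exact code isn't reproduced here, a minimal sketch of such a function, with the same call signature as used below (an assumption; the original may differ in details), could look like:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def sliding_window(arr, window, step):
    # Sketch of a strided window extractor over a 2D array.
    # Returns the windows flattened along the first axis,
    # shape (n_windows, window_height, window_width).
    wh, ww = window
    sh, sw = step
    out_h = (arr.shape[0] - wh) // sh + 1
    out_w = (arr.shape[1] - ww) // sw + 1
    strides = (arr.strides[0] * sh, arr.strides[1] * sw,
               arr.strides[0], arr.strides[1])
    windows = as_strided(arr, shape=(out_h, out_w, wh, ww),
                         strides=strides)
    # reshape forces a copy, so the result is safe to operate on
    return windows.reshape(-1, wh, ww)
```

Note that as_strided produces a view into the original buffer, so the final reshape (which copies) is what makes the result safe to modify.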

>>> a = np.arange(16).reshape(4,4)

step_height and step_width determine the overlap of the windows - in your case the steps equal the window size, so there is no overlap.

>>> window_height, window_width, step_height, step_width = 2, 2, 2, 2
>>> y = sliding_window(a, (window_height, window_width), (step_height,step_width))
>>> y
array([[[ 0,  1],
        [ 4,  5]],

       [[ 2,  3],
        [ 6,  7]],

       [[ 8,  9],
        [12, 13]],

       [[10, 11],
        [14, 15]]])

Operate on the windows:

>>> y = y.mean(axis = (1,2))
>>> y
array([  2.5,   4.5,  10.5,  12.5])

You need to determine the final shape depending on the number of windows.

>>> final_shape = (2,2)
>>> y = y.reshape(final_shape)
>>> y
array([[  2.5,   4.5],
       [ 10.5,  12.5]])

Searching SO for numpy, window, array should produce numerous other answers and possible solutions.

wwii

What you seem to be looking for is the mean over 2x2 blocks, which is not obtainable with zoom, since zoom uses interpolation (see its docstring).

To obtain what you show, try the following:

import numpy as np
x = np.arange(16).reshape(4, 4)

xx = (x.reshape(len(x) // 2, 2, x.shape[1] // 2, 2)
       .transpose(0, 2, 1, 3)
       .reshape(len(x) // 2, x.shape[1] // 2, -1)
       .mean(-1))

print(xx)

This yields

[[  2.5   4.5]
 [ 10.5  12.5]]
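The chained reshape/transpose is dense; broken into steps (pure NumPy, same result), it reads as follows:

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

# Step 1: split each axis into (block index, position within block).
blocks = x.reshape(2, 2, 2, 2)         # (row block, row in block, col block, col in block)
# Step 2: bring the two "within block" axes together at the end.
blocks = blocks.transpose(0, 2, 1, 3)  # (row block, col block, 2, 2)
# Step 3: average each 2x2 block.
xx = blocks.mean(axis=(2, 3))

print(xx)
# [[ 2.5  4.5]
#  [10.5 12.5]]
```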

Alternatively, this can be done using sklearn.feature_extraction.image.extract_patches:

from sklearn.feature_extraction.image import extract_patches

patches = extract_patches(x, patch_shape=(2, 2), extraction_step=(2, 2))

xx = patches.mean(-1).mean(-1)

print(xx)

However, if your goal is to subsample an image gracefully, then taking the mean over blocks of the image is not the right way to do it: it is likely to cause aliasing effects. What you should do in this case is smooth the image ever so slightly using scipy.ndimage.gaussian_filter (e.g. sigma=0.35 * subsample_factor) and then subsample simply by indexing [::2, ::2].
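A sketch of that anti-aliased subsampling, using the sigma suggested above (the 0.35 factor is a rule of thumb, not a universal constant):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

x = np.arange(16, dtype=float).reshape(4, 4)
subsample_factor = 2

# Smooth slightly to suppress frequencies that the coarser grid
# cannot represent, then subsample by plain indexing.
smoothed = gaussian_filter(x, sigma=0.35 * subsample_factor)
y = smoothed[::subsample_factor, ::subsample_factor]

print(y.shape)  # (2, 2)
```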

eickenberg
  • I'd like this operation to compare scientific data at different sampling levels, not to process images in the conventional way. Therefore I don't know whether applying a Gaussian filter would be a good thing in my case. I'll keep it in mind. – Amxx Nov 10 '14 at 15:23
  • All the rules of sampling still apply then, right? Aliasing is not image-specific but can happen with any continuous signal that you try to approximate with discrete measurements. Averaging adjacent values corresponds to multiplying with a `sinc`-type function in Fourier space, which has infinite support and doesn't decrease well, whereas Gaussian smoothing, despite its infinite support, is at least rapidly decreasing. – eickenberg Nov 10 '14 at 15:46