Changing structure of numpy array using most common value

Question

How can I downscale raster data of size 4x6 to size 2x3 using the 'mode', i.e., the most common value within each 2x2 block of pixels?

import numpy as np
data=np.array([
[0,0,1,1,1,1],
[1,0,0,1,1,1],
[1,0,1,1,0,1],
[1,1,0,1,0,0]])

The result should be:

result = np.array([
    [0,1,1],
    [1,1,0]])

I was expecting result to be full of `1`s, because in any `2x2` block there is always a `1`. — gg349, May 05 '14 at 12:19
Since the maximum occurred element among '0,0,1,0' in the first 2*2 block is 0, I want to get 0 as the output. — , May 05 '14 at 12:22
I have asked a different question for the desired result you mentioned. — , May 05 '14 at 12:24
@neha - I have changed "maximum occurred" to "most common" which I think is easier to understand. I hope you don't mind. — mtrw, May 05 '14 at 15:41

score 1 · Answer 1 · answered May 05 '14 at 12:57

Here's one way to go:

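from itertools import product
from numpy import empty, argmax, bincount

# One output cell per 2x2 block; // keeps the shape an integer in Python 3.
res = empty((data.shape[0] // 2, data.shape[1] // 2), dtype=int)
for j, k in product(range(res.shape[0]), range(res.shape[1])):
    # Flatten the 2x2 block and take its most frequent value.
    subvec = data[2*j:2*j+2, 2*k:2*k+2].flatten()
    res[j, k] = argmax(bincount(subvec))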

This works as long as the input data contains an integer number of 2x2 blocks.
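If the array's dimensions are not multiples of the block size, one option is to crop the trailing rows and columns first. The following is a minimal sketch under that assumption; the helper name `crop_to_blocks` is hypothetical and not part of the answer above. Cropping discards data, so padding may suit some rasters better.

```python
import numpy as np

def crop_to_blocks(arr, nrows=2, ncols=2):
    # Hypothetical helper: drop trailing rows/columns that do not fill a whole block.
    h, w = arr.shape
    return arr[:h - h % nrows, :w - w % ncols]

odd = np.arange(35).reshape(5, 7)   # 5x7 does not divide into 2x2 blocks
cropped = crop_to_blocks(odd)
print(cropped.shape)  # (4, 6)
```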

Notice that a block like [[0,0],[1,1]] yields 0, because argmax returns the index of the first occurrence of the maximum only. Use res[j,k]=subvec.max()-argmax(bincount(subvec)[::-1]) if you want such tied 2x2 blocks to count as 1.
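The tie-breaking behaviour can be checked on a single tied block. This is only an illustrative sketch; the variable names `lowest` and `highest` are made up for the example.

```python
import numpy as np

subvec = np.array([0, 0, 1, 1])   # flattened block [[0,0],[1,1]]
counts = np.bincount(subvec)      # [2 2]: both values occur twice

# argmax picks the first maximum, so the tie resolves to the smaller value.
lowest = np.argmax(counts)
# Searching the reversed counts picks the last maximum, i.e. the larger value.
highest = subvec.max() - np.argmax(counts[::-1])

print(lowest, highest)  # 0 1
```

The reversed form relies on bincount returning exactly subvec.max() + 1 entries, which is the case when no minlength is passed.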

score 1 · Accepted Answer · edited May 23 '17 at 12:11

Please refer to this thread for a full explanation. The following code will calculate your desired result.

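import numpy as np
# Note: extract_patches comes from older scikit-learn releases; newer versions may no longer expose it publicly.
from sklearn.feature_extraction.image import extract_patches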

data=np.array([
    [0,0,1,1,1,1],
    [1,0,0,1,1,1],
    [1,0,1,1,0,1],
    [1,1,0,1,0,0]])

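# Split the array into non-overlapping 2x2 patches.
patches = extract_patches(data, patch_shape=(2, 2), extraction_step=(2, 2))
# Count the nonzero pixels per patch; more than 2 of 4 means 1 is the most common value.
# This shortcut only works for binary (0/1) data, and ties resolve to 0.
most_frequent_number = ((patches > 0).sum(axis=-1).sum(axis=-1) > 2).astype(int)
print(most_frequent_number)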
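For readers without that scikit-learn function, the same counting idea can probably be expressed with a plain NumPy reshape. This is a sketch rather than the answer's own method, and it assumes binary 0/1 data whose dimensions are multiples of 2; the name `ones_per_block` is introduced here for illustration.

```python
import numpy as np

data = np.array([
    [0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0]])

h, w = data.shape
# Axes after reshape: (block row, row within block, block column, column within block).
ones_per_block = data.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
# More than 2 ones out of 4 pixels means 1 is the most common value; ties give 0.
result = (ones_per_block > 2).astype(int)
print(result)
# [[0 1 1]
#  [1 1 0]]
```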

score 1 · Answer 3 · edited May 23 '17 at 11:57

There appears to be more than one statistic you wish to collect about each block. Using toblocks (below) you can apply various computations to the last axis of blocks to obtain the desired statistics:

import numpy as np
import scipy.stats as stats

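def toblocks(arr, nrows, ncols):
    """Split a 2D array into nrows x ncols blocks, flattening each block along the last axis."""
    h, w = arr.shape
    # Separate the row and column indices within each block, then group them together.
    blocks = (arr.reshape(h // nrows, nrows, -1, ncols)
              .swapaxes(1, 2)
              .reshape(h // nrows, w // ncols, ncols * nrows))
    return blocks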

data=np.array([
    [0,0,1,1,1,1],
    [1,0,0,1,1,1],
    [1,0,1,1,0,1],
    [1,1,0,1,0,0]])

blocks = toblocks(data, 2, 2)
vals, counts = stats.mode(blocks, axis=-1)
vals = vals.squeeze()
print(vals)
# [[ 0.  1.  1.]
#  [ 1.  1.  0.]]
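If SciPy is not available, the per-block mode can also be sketched with NumPy alone by applying bincount to each flattened block. The function name `block_mode` is hypothetical, and the sketch assumes non-negative integer data whose dimensions are multiples of the block size; ties resolve to the smallest value.

```python
import numpy as np

def block_mode(arr, nrows, ncols):
    # Hypothetical helper: most common value in each nrows x ncols block.
    h, w = arr.shape
    blocks = (arr.reshape(h // nrows, nrows, w // ncols, ncols)
                 .swapaxes(1, 2)
                 .reshape(h // nrows, w // ncols, nrows * ncols))
    # bincount needs non-negative integers; argmax returns the smallest value on ties.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), -1, blocks)

data = np.array([
    [0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0]])

print(block_mode(data, 2, 2))
# [[0 1 1]
#  [1 1 0]]
```

Unlike the thresholding shortcut, this version is not limited to binary data, although the Python-level call per block makes it slower on large rasters.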
