How can I downscale 4 x 6 raster data to 2 x 3, so that each output cell is 1 if any element in the corresponding 2 x 2 block of pixels is 1, and 0 otherwise?

import numpy as np
data=np.array([
[0,0,1,1,0,0],
[1,0,0,1,0,0],
[1,0,1,0,0,0],
[1,1,0,0,0,0]])

The result should be:

result = np.array([
    [1,1,0],
    [1,1,0]])
sshashank124

2 Answers

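import numpy as np

def toblocks(arr, nrows, ncols):
    # Split a 2D array into (nrows x ncols) blocks and flatten each block.
    # Assumes arr.shape is an exact multiple of (nrows, ncols).
    h, w = arr.shape
    blocks = (arr.reshape(h // nrows, nrows, -1, ncols)
              .swapaxes(1, 2)
              .reshape(h // nrows, w // ncols, ncols * nrows))
    return blocks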

data = np.array([[0, 0, 1, 1, 0, 0],
                 [1, 0, 0, 1, 0, 0],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 0, 0]])


blocks = toblocks(data, 2, 2)
downscaled = blocks.any(axis=-1).astype(blocks.dtype)
print(downscaled)
# [[1 1 0]
#  [1 1 0]]

Where the above solution comes from: a while ago, an SO question asked how to break an array into blocks. All I did was slightly modify that solution to apply `any` to each of the blocks.
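As a variation on the same idea, the blocks do not have to be flattened first: after a single reshape, `any` can reduce over the two within-block axes directly. This is a minimal sketch, not part of the original answer; the helper name `downscale_any` and the divisibility check are assumptions added for illustration.

```python
import numpy as np

def downscale_any(arr, nrows, ncols):
    """Return 1 for each nrows x ncols block containing a nonzero value, else 0."""
    h, w = arr.shape
    # Assumption: the array must divide evenly into blocks.
    if h % nrows or w % ncols:
        raise ValueError("array shape must be a multiple of the block shape")
    # Axes after reshape: (block row, row in block, block column, column in block).
    blocks = arr.reshape(h // nrows, nrows, w // ncols, ncols)
    # Reduce over the two within-block axes in one call.
    return blocks.any(axis=(1, 3)).astype(arr.dtype)

data = np.array([[0, 0, 1, 1, 0, 0],
                 [1, 0, 0, 1, 0, 0],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 0, 0]])

print(downscale_any(data, 2, 2))
# [[1 1 0]
#  [1 1 0]]
```

Swapping `any` for `max`, `sum`, or `mean` over the same two axes gives other block-wise reductions.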

unutbu

You could use the patch extraction routine of scikit-learn as follows (you should be able to copy and paste):

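import numpy as np
# Note: extract_patches belongs to older scikit-learn releases; newer versions
# may no longer expose it under this name.
from sklearn.feature_extraction.image import extract_patches

data = np.array([[0, 0, 1, 1, 0, 0],
                 [1, 0, 0, 1, 0, 0],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 0, 0]])

# 4D view: the first two axes index the patches, the last two hold each 2x2 patch
patches = extract_patches(data, patch_shape=(2, 2), extraction_step=(2, 2))
# 1 where a patch contains at least one nonzero element, otherwise 0
non_zero_count_patches = (patches > 0).any(axis=-1).any(axis=-1).astype(int)
print(non_zero_count_patches)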

Explanation: the function `extract_patches` generates a view on your array that represents sliding patches of size `patch_shape`, taken at a step of `extraction_step`; both can be varied as you want. The next line checks which of the patches contain a nonzero item. This check can be replaced by anything else you may be interested in, such as the mean, the sum, etc. An advantage is that you can choose the patch size and the extraction step freely (they do not need to correspond), without memory overhead until `any` is invoked (it uses strides internally).
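The same strided-view idea can be sketched with NumPy alone. The example below is not the answer's method: it swaps in `numpy.lib.stride_tricks.sliding_window_view` (assuming NumPy 1.20 or later) and slices the result with a step of 2 to mimic an extraction step of (2, 2). The variable names are chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([[0, 0, 1, 1, 0, 0],
                 [1, 0, 0, 1, 0, 0],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 0, 0]])

# Read-only view of every 2x2 window at every position: shape (3, 5, 2, 2).
windows = sliding_window_view(data, (2, 2))

# Keep every second window along each axis, i.e. a step of (2, 2).
# This is still a view, so no patch data has been copied yet.
patches = windows[::2, ::2]

# Reduce over the last two axes, which hold the contents of each patch.
has_nonzero = (patches > 0).any(axis=(-2, -1)).astype(int)
print(has_nonzero)
# [[1 1 0]
#  [1 1 0]]

# Any other reduction can be applied the same way, for example the mean.
patch_means = patches.mean(axis=(-2, -1))
```

Changing the window shape or the slicing step independently gives overlapping or sparser patches.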

eickenberg
  • Thanks, I liked your solution. Can the problem in this question be solved similarly? http://stackoverflow.com/questions/23472206/changing-structure-of-numpy-array-using-maximum-occurred-value –  May 05 '14 at 13:33
  • Indeed it can. If, as in your case, you are working only with 0s and 1s, you can replace `.any(axis=-1).any(axis=-1).astype(int)` with `.sum(-1).sum(-1) > 1`. Depending on the decision rule for tie-breaking, it may also be `> 2`. – eickenberg May 05 '14 at 13:39
  • In general, the extracted patch data are found on the last two axes of the 4D array `patches`, and the patches themselves are indexed by the first two axes. So any operation you may be interested in needs to be applied to the last two axes (which I did in the examples by using -1 twice: first on the 4th axis, then on the 3rd axis of the reduced array). – eickenberg May 05 '14 at 13:48
  • @eickenberg: If you reshape the 4D array to combine the last two axes into one axis, then perhaps you can do the rest of the computation with one function call... – unutbu May 05 '14 at 13:50
  • That was my first solution, but I changed it back, because I thought the direct summation may cause less memory overhead, but I am now doubtful whether this is true. So yes, reshaping the last two axes into one is definitely an option. – eickenberg May 05 '14 at 13:52
  • @neha if you need to extract many different features from your data you can obtain them all after having called `patches = extract_patches(...)`. Then it may be interesting to call `patches = patches.reshape(2, 3, -1)` and apply your operations to the last axis, as many of them as you want. In this setting, this will need 4x your array size in memory. – eickenberg May 05 '14 at 13:54
  • Actually, I was wrong: this creates one extra array of the size of your data, as does the other proposed solution. – eickenberg May 05 '14 at 14:18
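
The suggestions in these comments (combining the last two axes and thresholding the per-patch sum) can be sketched as follows. To stay self-contained, the 4D patch array is built here with a plain NumPy reshape instead of `extract_patches`; that substitution, and the variable names, are assumptions for illustration.

```python
import numpy as np

data = np.array([[0, 0, 1, 1, 0, 0],
                 [1, 0, 0, 1, 0, 0],
                 [1, 0, 1, 0, 0, 0],
                 [1, 1, 0, 0, 0, 0]])

# Stand-in for the 4D patch array: the first two axes index the patches,
# the last two axes hold each 2x2 patch.
patches = data.reshape(2, 2, 3, 2).swapaxes(1, 2)   # shape (2, 3, 2, 2)

# Combine the last two axes so each patch becomes one flat axis of 4 values.
flat = patches.reshape(2, 3, -1)                     # shape (2, 3, 4)

# Number of ones in each patch.
counts = flat.sum(axis=-1)

# Majority value per patch; the threshold decides how a 2-2 tie is broken.
majority_ties_to_one = (counts > 1).astype(int)   # 2 or more ones give 1
majority_ties_to_zero = (counts > 2).astype(int)  # 3 or more ones give 1

print(counts)
# [[1 3 0]
#  [3 1 0]]
print(majority_ties_to_one)
# [[0 1 0]
#  [1 0 0]]
```

For this particular input no patch contains exactly two ones, so both thresholds give the same result; they differ only on tied patches.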