
I have an image stored as a 2d numpy array (possibly multi-d).

I can make a view onto that array that reflects a 2d sliding window, but when I reshape it so that each row is a flattened window (rows are windows, columns are the pixels within a window), NumPy makes a full copy. It does this because I'm using the typical stride trick, and the new shape isn't contiguous in memory.

I need this because I'm passing entire large images to an sklearn classifier that accepts 2d matrices and has no batch/partial-fit procedure, and the full expanded copy is far too large for memory.
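To make the scale concrete, here is a rough back-of-the-envelope sketch (the image and window sizes below are made up for illustration, not my actual data):

N, M, k = 4000, 4000, 15                 # hypothetical image and window size
n_windows = (N - k + 1) * (M - k + 1)    # number of sliding windows
expanded_bytes = n_windows * k * k * 8   # float64 rows of length k*k
print(expanded_bytes / 1e9)              # ~28.6 GB, versus ~0.13 GB for the image itself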

My Question: Is there a way to do this without making a full copy of the view?

I believe an answer will either be (1) something about strides or NumPy memory management that I've overlooked, or (2) some kind of masked memory structure for Python that can emulate a NumPy array even to an external package like sklearn that includes Cython.

This task of training over moving windows of a 2d image in memory is common, but the only attempt I know of to account for patches directly is the Vigra project (http://ukoethe.github.io/vigra/).

Thanks for the help.

>>> import numpy as np
>>> A = np.arange(9).reshape(3, 3)
>>> print A
[[0 1 2]
 [3 4 5]
 [6 7 8]]
>>> xstep=1;ystep=1; xsize=2; ysize=2
>>> window_view = np.lib.stride_tricks.as_strided(A, ((A.shape[0] - xsize + 1) / xstep, (A.shape[1] - ysize + 1) / ystep, xsize, ysize),
...       (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
>>> print window_view 
[[[[0 1]
   [3 4]]

  [[1 2]
   [4 5]]]


 [[[3 4]
   [6 7]]

  [[4 5]
   [7 8]]]]
>>> 
>>> np.may_share_memory(A,window_view)
True
>>> B=window_view.reshape(-1,xsize*ysize)
>>> np.may_share_memory(A,B)
False
locallyoptimal
  • I think this is impossible; even if you pass an `as_strided` array to an sklearn classifier, I think most (if not all) of the classifiers will copy your data if it's not contiguous. – HYRY Jul 18 '14 at 03:22
  • Yeah, I'm pretty sure that can not be done. Sorry. If you find a way, let me know ;) Also: directly inputting an image might not be a good idea and computing features might solve your problem. – Andreas Mueller Jul 18 '14 at 12:58
  • Definitely rule out number (1): `sklearn.feature_extraction.image.extract_patches` gives you exactly the view you are talking about, and reshaping it will definitely make a copy, according to numpy rules. Are you sure you need all patches of many images at once? You may want to look into online/batched algorithms for whatever your objective is. Try `SGDClassifier` for instance (sketched below). – eickenberg Jul 18 '14 at 21:31
  • @HYRY Depends on the estimator, really. Contiguous data is usually not a requirement. – Fred Foo Jul 21 '14 at 16:00
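
Following up on eickenberg's suggestion above, here is a rough sketch of what batched training could look like with SGDClassifier and partial_fit (the window helper, the random image, and the per-window labels are all made up for illustration). Each batch row of windows is still copied when reshaped, but only one batch lives in memory at a time:

import numpy as np
from numpy.lib.stride_tricks import as_strided
from sklearn.linear_model import SGDClassifier

def window_view(A, xsize=2, ysize=2, xstep=1, ystep=1):
    # 4-d strided view: (window rows, window cols, xsize, ysize)
    shape = ((A.shape[0] - xsize) // xstep + 1,
             (A.shape[1] - ysize) // ystep + 1,
             xsize, ysize)
    strides = (A.strides[0] * xstep, A.strides[1] * ystep,
               A.strides[0], A.strides[1])
    return as_strided(A, shape=shape, strides=strides)

image = np.random.rand(100, 100)                  # stand-in image
labels = np.random.randint(0, 2, size=(99, 99))   # made-up label per window

view = window_view(image)
clf = SGDClassifier()
classes = np.array([0, 1])
for i in range(view.shape[0]):
    # Reshaping copies this one row of windows, never the whole expansion.
    X_batch = view[i].reshape(-1, 2 * 2)
    clf.partial_fit(X_batch, labels[i], classes=classes)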

1 Answer


Your task isn't possible using only strides, but NumPy does support one kind of array that does the job. With strides and a masked_array you can create the desired view onto your data. However, not all NumPy functions support operations on masked arrays, so it is possible that scikit-learn doesn't handle them well either.

Let's first take a fresh look at what we are trying to do here. Consider the input data of your example. Fundamentally, the data is just a 1d array in memory, and it is simpler to think about the strides in those terms. The array only appears to be 2d because we have defined its shape. Using strides, that shape could be defined like this:

import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.arange(9)
isize = base.itemsize
A = as_strided(base, shape=(3, 3), strides=(3 * isize, isize))
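
Printing this reconstructed A confirms it matches the original 3x3 layout:

>>> A
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])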

Now the goal is to set strides on base so that the view orders the numbers like your end array, B. In other words, we are asking for integers a and b such that

>>> as_strided(base, shape=(4, 4), strides=(a, b))
array([[0, 1, 3, 4],
       [1, 2, 4, 5],
       [3, 4, 6, 7],
       [4, 5, 7, 8]])

But this is clearly impossible: the offset from row 0 to row 1 would have to be one item (0 to 1), while the offset from row 1 to row 2 would have to be two items (1 to 3), so no single row stride a works. The closest view we can achieve like this is a rolling window over base:

>>> C = as_strided(base, shape=(5, 5), strides=(isize, isize))
>>> C
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])

But the difference here is that we have extra rows and columns which we would like to get rid of. So, effectively, we are asking for a rolling window that is not contiguous and that also makes jumps at regular intervals: in this example we want every third item excluded from each window and a jump over one item after every two rows.

We can describe this as a masked_array:

>>> mask = np.zeros((5, 5), dtype=bool)
>>> mask[2, :] = True
>>> mask[:, 2] = True
>>> D = np.ma.masked_array(C, mask=mask)

This array contains exactly the data that we want, and it is only a view onto the original data. We can confirm that the data is equal:

>>> D.data[~D.mask].reshape(4, 4)
array([[0, 1, 3, 4],
       [1, 2, 4, 5],
       [3, 4, 6, 7],
       [4, 5, 7, 8]])

But as I said in the beginning, it is quite likely that scikit-learn doesn't understand masked arrays. If it simply converts the masked array to a plain array, the data will be wrong:

>>> np.array(D)
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
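
For completeness, extracting the unmasked values does recover the layout we want, but only by copying, which is exactly the limitation described above (a small sketch using the D built earlier):

>>> D.compressed().reshape(4, 4)
array([[0, 1, 3, 4],
       [1, 2, 4, 5],
       [3, 4, 6, 7],
       [4, 5, 7, 8]])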
jasaarim
  • Your answer came up as a related link to http://stackoverflow.com/a/35805797/901925. The OP wanted to reshape a block view without copying. Your masking is tempting - except many `ma` functions operate by using `filled` to replace masked values with innocuous ones (e.g. `filled(0)` for `ma.sum`). That's a temporary copy for each masked operation. – hpaulj Mar 04 '16 at 23:00