I have an image stored as a 2d numpy array (possibly multi-d).
I can make a view onto that array that reflects a 2d sliding window, but when I reshape it so that each row is a flattened window (rows are windows, column is a pixel in that window) python makes a full copy. It does this because I'm using the typical stride trick, and the new shape isn't contiguous in memory.
I need this because I'm passing entire large images to an sklearn classifier, which accepts 2d matrices, where there's no batch/partial fit procedure, and the full expanded copy is far too large for memory.
My Question: Is there a way to do this without making a fully copy of the view?
I believe an answer will either be (1) something about strides or numpy memory management that I've overlooked, or (2) some kind of masked memory structure for python that can emulate a numpy array even to an external package like sklearn that includes cython.
This task of training over moving windows of a 2d image in memory is common, but the only attempt I know of to account for patches directly is the Vigra project (http://ukoethe.github.io/vigra/).
Thanks for the help.
>>> A=np.arange(9).reshape(3,3)
>>> print A
[[0 1 2]
[3 4 5]
[6 7 8]]
>>> xstep=1;ystep=1; xsize=2; ysize=2
>>> window_view = np.lib.stride_tricks.as_strided(A, ((A.shape[0] - xsize + 1) / xstep, (A.shape[1] - ysize + 1) / ystep, xsize, ysize),
... (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
>>> print window_view
[[[[0 1]
[3 4]]
[[1 2]
[4 5]]]
[[[3 4]
[6 7]]
[[4 5]
[7 8]]]]
>>>
>>> np.may_share_memory(A,window_view)
True
>>> B=window_view.reshape(-1,xsize*ysize)
>>> np.may_share_memory(A,B)
False