How to reliably create a multi-dimensional array and a one-dimensional view of it in numpy, so that the memory layout is contiguous?

Question

According to the documentation of numpy.ravel,

Return a contiguous flattened array.

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

For convenience and efficiency of indexing, I would like to have a one-dimensional view of a 2-dimensional array. I am using ravel for creating the view, and so far so good.

However, it is not clear to me what is meant by "A copy is made only if needed." If some day a copy is created while my code is executed, the code will stop working.

I know that there is numpy.reshape, but its documentation says:

It is not always possible to change the shape of an array without copying the data.

In any case, I would like the data to be contiguous.

How can I reliably create a 2-dimensional array and a 1-dimensional view into it? I would like the data to be contiguous in memory (for efficiency). Are there any attributes to specify when creating the 2-dimensional array to ensure that it is contiguous and ravel will not need to copy it?

Related question: What is the difference between flatten and ravel functions in numpy?

As the documentation and the linked post say, you cannot create a flat view of an array that is not contiguous in memory in Numpy. You either have to copy the 2D array to make it contiguous (unless it already is) or create a 1D copy of the flattened version (which is almost the same operation internally). If you do not want that, then you have to deal with non-contiguous views. This constraint comes from the data layout in (virtual) memory and cannot be avoided (it is independent of Numpy). — Jérôme Richard, Nov 27 '21 at 13:12
@JérômeRichard, is there a documented guarantee that (1) when I create a 2-D array, it will be stored contiguously, (2) if a 2-D array is stored contiguously, `ravel` will not copy it? — Alexey, Nov 27 '21 at 13:27
(1) I think yes, but if this is critical, you can create a 1D array and reshape it to a 2D one to be 100% sure. (2) yes (unless you play with the F/C order which is generally not used). You can check if an array is a copy/view of another one using `arr.base`. — Jérôme Richard, Nov 27 '21 at 13:35
I guess it depends on which order ({'C', 'F', 'A', 'K'}) you use. @JérômeRichard why arr.base and not numpy.shares_memory? https://numpy.org/doc/stable/reference/generated/numpy.shares_memory.html#numpy.shares_memory — pippo1980, Nov 27 '21 at 15:53
@pippo1980 Indeed. I did not know this one ;) . Thanks. In practice, I think the two could be used interchangeably here. — Jérôme Richard, Nov 27 '21 at 16:18
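
The "create a 1D array and reshape it to a 2D one" suggestion from the comments could look roughly like the sketch below. The names `flat` and `grid` are made up for illustration, and the checks use the `arr.base` and `numpy.shares_memory` tools mentioned above.

```python
import numpy as np

# Allocate the storage as a 1-D array first, then view it as 2-D.
flat = np.zeros(6)            # 1-D buffer that owns its data
grid = flat.reshape(2, 3)     # 2-D view of the same buffer (default C order)

# Both checks indicate that no copy was made.
print(grid.base is flat)               # True
print(np.shares_memory(flat, grid))    # True

# A write through the 2-D name is visible through the 1-D name.
grid[1, 2] = 7.0
print(flat.tolist())                   # [0.0, 0.0, 0.0, 0.0, 0.0, 7.0]
```

With this construction the 1-D array is the owner of the memory, so there is no later ravel step that could turn into a copy.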

hpaulj · Answer 1 · 2021-11-27T17:15:10.980

The warnings for ravel and reshape are the same. ravel is just reshape(-1), i.e. a reshape to 1d. Conversely, the reshape docs tell us that we can think of it as first doing a ravel.

Normal array construction produces a contiguous array, and reshape with the same order will produce a view. You can visually test that by looking at the ravel and checking if the values appear in the expected order.

In [348]: x = np.arange(6).reshape(2,3)
In [349]: x
Out[349]: 
array([[0, 1, 2],
       [3, 4, 5]])
In [350]: x.ravel()
Out[350]: array([0, 1, 2, 3, 4, 5])

I started with the arange, reshaped it to 2d, and back to 1d. No change in order.
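
Instead of checking the order by eye, the same thing can be tested in code. This is only a sketch using the array from the session above; `y` is a name I picked for the ravel result.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
y = x.ravel()

print(x.flags['C_CONTIGUOUS'])   # True: rows are stored one after another
print(np.shares_memory(x, y))    # True: the ravel is a view, not a copy

# Because y is a view, writing to it changes x as well.
y[0] = 100
print(x[0, 0])                   # 100
```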

But if I make a sliced view:

In [351]: x[:,:2]
Out[351]: 
array([[0, 1],
       [3, 4]])
In [352]: x[:,:2].ravel()
Out[352]: array([0, 1, 3, 4])

This ravel has a gap, and thus is a copy.
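
A small sketch that confirms this programmatically (the name `s` for the slice is just for illustration):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
s = x[:, :2]          # view that skips the last column of x

print(s.flags['C_CONTIGUOUS'])           # False: there is a gap after each row
print(np.shares_memory(x, s))            # True: the slice itself is still a view
print(np.shares_memory(x, s.ravel()))    # False: the ravel had to copy
```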

Transpose is also a view, but one that cannot be raveled to a view in the default order:

In [353]: x.T
Out[353]: 
array([[0, 3],
       [1, 4],
       [2, 5]])
In [354]: x.T.ravel()
Out[354]: array([0, 3, 1, 4, 2, 5])

Except, if we specify the right order, the ravel is a view.

In [355]: x.T.ravel(order='F')
Out[355]: array([0, 1, 2, 3, 4, 5])

reshape has an extensive discussion of order. And transpose actually works by returning a view with different shape and strides. For a 2d array, transpose produces an order F array.
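
The shape/strides point can be inspected directly. The following is a sketch on the same example array; it compares strides rather than printing them, since the byte values depend on the integer size of the platform.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
t = x.T

# The transpose reuses the buffer with the strides reversed.
print(t.strides == x.strides[::-1])              # True
print(t.flags['F_CONTIGUOUS'])                   # True
print(t.flags['C_CONTIGUOUS'])                   # False

# Raveling in the matching order gives a view; the default order copies.
print(np.shares_memory(x, t.ravel(order='F')))   # True
print(np.shares_memory(x, t.ravel()))            # False
```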

So as long as you are aware of manipulations like this, you can safely assume that the reshape/ravel of a normally constructed array is a view of contiguous data.
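
If you would rather fail loudly than rely on that assumption, one option is a small guard around the reshape. `ravel_view` is a hypothetical helper name, not a numpy function; this is a sketch, not the only way to do it.

```python
import numpy as np

def ravel_view(a):
    """Return a 1-D C-order view of `a`, or raise if a copy would be needed."""
    flat = a.reshape(-1)
    # For a non-empty array, a result that shares no memory with `a` is a copy.
    if a.size and not np.shares_memory(flat, a):
        raise ValueError("array cannot be flattened without copying")
    return flat

x = np.arange(6).reshape(2, 3)
v = ravel_view(x)          # fine: x is C-contiguous
v[0] = 42                  # writes through to x

try:
    ravel_view(x.T)        # the transpose needs a copy in C order
except ValueError as err:
    print("refused:", err)
```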

Note that even though the ravel in [352] is a copy, assignment to the flat changes the original:

In [361]: x[:,:2].flat[:] = [3,4,2,1]
In [362]: x
Out[362]: 
array([[3, 4, 2],
       [2, 1, 5]])

In contrast, x[:,:2].ravel()[:] = [10,11,2,3] does not change x, because the assignment goes into the temporary copy. In cases like this y = x[:,:2].flat may be more useful than the ravel equivalent.
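
Side by side, starting again from a fresh x (a sketch; `y` is the flat iterator over the sliced view):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# ravel of the sliced view is a copy, so this assignment is lost.
x[:, :2].ravel()[:] = [10, 11, 2, 3]
print(x.tolist())      # [[0, 1, 2], [3, 4, 5]]

# flat iterates over the view's own elements, so this writes into x.
y = x[:, :2].flat
y[:] = [10, 11, 2, 3]
print(x.tolist())      # [[10, 11, 2], [2, 3, 5]]
```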

https://numpy.org/doc/stable/reference/generated/numpy.ravel.html says: "'F' means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest." @hpaulj sorry to bother, I got the column-major bit, but what does "changing fastest/slowest" mean? — pippo1980, Nov 27 '21 at 16:38
@pippo1980, in my [353] example, we get values in order by going down a column, then over one, and down that column etc. That's in contrast to [349], where we get consecutive values by going across rows. That's all the "fastest" means. Don't worry about actual execution speeds. — hpaulj, Nov 27 '21 at 16:45
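
To make "fastest" concrete, here is a sketch that lists the (row, column) index pairs in the order each layout visits them, using `np.unravel_index` on a 2x3 shape. The helper name `index_order` is made up for this example.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

def index_order(shape, order):
    """List the (row, col) pairs in the order the given layout walks them."""
    rows, cols = np.unravel_index(np.arange(shape[0] * shape[1]), shape, order=order)
    return list(zip(rows.tolist(), cols.tolist()))

# C order: the last index (column) changes fastest.
print(index_order(x.shape, 'C'))   # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

# F order: the first index (row) changes fastest.
print(index_order(x.shape, 'F'))   # [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]

# Reading x in F order therefore goes down each column in turn.
print(x.ravel(order='F').tolist()) # [0, 3, 1, 4, 2, 5]
```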
