I am trying to understand what happens in memory when Python (NumPy) performs certain operations. From this reply, I understand how strides work and why they matter. What I would like to know now is this: if a transpose does not 'physically' move the data in memory, how does the data come out correctly ordered when I call .flatten(order="C") after the transpose? Given the strides, I know it must be possible to implement this operation, but I can't come up with an algorithm that works for any 'transposed' strides.

import numpy as np

array = np.arange(24).reshape(2, 3, 4)
print(array.flatten(order='C'))
array = array.transpose(1, 0, 2)
print(array.flatten(order='C'))

>>> [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
>>> [ 0  1  2  3 12 13 14 15  4  5  6  7 16 17 18 19  8  9 10 11 20 21 22 23]
Maxime D.

2 Answers

You can check the strides attribute to see what is going on in practice:

# Byte strides below assume a 4-byte integer dtype (int32);
# with an 8-byte dtype (int64) every value doubles.
array = np.arange(24).reshape(2, 3, 4)
print(array.strides)                      # (48, 16, 4)
tmp = array.flatten(order='C')
print(tmp.strides)                        # (4,)
array = array.transpose(1, 0, 2)
print(array.strides)                      # (16, 48, 4)
tmp = array.flatten(order='C')
print(tmp.strides)                        # (4,)

As we can see, the flattened array is always contiguous, while the transposed (not flattened) array is not.

Actually, NumPy tries never to copy data unless you request it (or for basic out-of-place operations). That being said, there are cases like this one where NumPy has no choice but to create a copy of the target array. Indeed, the stride is always uniform along a given axis (by design), so a flattened transposed array is necessarily contiguous. A 1-D result has a single stride, and no single stride over the original buffer yields the transposed element order, so the data has to be copied.
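The view-versus-copy behaviour described above can be checked directly. This is a small sketch using np.shares_memory; the variable name `transposed` is chosen here for illustration and is not from the original answer.

```python
import numpy as np

array = np.arange(24).reshape(2, 3, 4)
transposed = array.transpose(1, 0, 2)

# The transpose is a view: same buffer, permuted strides.
print(np.shares_memory(array, transposed))               # True
print(transposed.flags['C_CONTIGUOUS'])                  # False

# flatten() always returns a copy, even for a contiguous input.
print(np.shares_memory(array, array.flatten()))          # False

# ravel() returns a view when it can, and a copy when it cannot.
print(np.shares_memory(array, array.ravel()))            # True
print(np.shares_memory(transposed, transposed.ravel()))  # False
```

Only the last call has to copy, because the transposed element order cannot be described by one stride over the original buffer.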

Jérôme Richard
  • Could you elaborate more on the bold part of your answer? – Chrysophylaxs Mar 29 '23 at 21:50
  • @Chrysophylaxs, I think he's just saying that like the `shape`, there's one `stride` value per axis. A flattened array is 1d, and thus has 1 stride value, the `itemsize`. While `flatten` always makes a copy, `ravel` like `reshape` "tries" not to, but following something like a `transpose`, it too will be a copy. – hpaulj Mar 30 '23 at 03:05

Calling your transposed array arrt, we see that:

In [372]: arrt.shape, arrt.strides
Out[372]: ((3, 2, 4), (16, 48, 4))

So the strides, in units of elements (the byte strides divided by the itemsize of 4), are (4, 12, 1).

The ravel/flatten can then be produced with:

In [373]: res = np.zeros(arrt.size, int)
     ...: rcnt = 0
     ...: for i in range(0,3):
     ...:     for j in range(0,2):
     ...:         for k in range(0, 4):
     ...:             res[rcnt] = arrt.base[i*4+j*12+k*1]
     ...:             rcnt += 1
     ...:             

In [374]: res
Out[374]: 
array([ 0,  1,  2,  3, 12, 13, 14, 15,  4,  5,  6,  7, 16, 17, 18, 19,  8,
        9, 10, 11, 20, 21, 22, 23])

arrt.base is the original np.arange. In the innermost loop (k) we step through the base by 1, the j loop steps by 12, and the outer loop (i) steps by 4.
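The same idea can be written for any number of axes. The following is only a sketch: `flatten_with_strides` is a hypothetical helper, not a NumPy function, and it assumes the array is a plain axis permutation of a C-contiguous array (so the view starts at offset 0 of the buffer and all strides are positive). It is far slower than the compiled code and is meant only to show the index arithmetic.

```python
import numpy as np

def flatten_with_strides(arr):
    """C-order flatten of `arr` using only its shape, strides and buffer.

    Hypothetical helper; assumes `arr` is a transpose of a C-contiguous
    array (offset 0, positive strides).
    """
    # 1-D view of the underlying buffer (the array itself if it owns its data).
    base = arr.base if arr.base is not None else arr
    flat = base.reshape(-1)

    # Convert byte strides into element strides.
    elem_strides = [s // arr.itemsize for s in arr.strides]

    out = np.empty(arr.size, dtype=arr.dtype)
    # np.ndindex walks the indices in C order (last axis varies fastest).
    for n, index in enumerate(np.ndindex(*arr.shape)):
        offset = sum(i * s for i, s in zip(index, elem_strides))
        out[n] = flat[offset]
    return out

arrt = np.arange(24).reshape(2, 3, 4).transpose(1, 0, 2)
print(np.array_equal(flatten_with_strides(arrt), arrt.flatten(order='C')))  # True
```

For the (3, 2, 4) example this reduces to the three nested loops above, with the offset computed as i*4 + j*12 + k*1.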

hpaulj
  • Thank you @hpaulj, that is exactly what I wanted to know. – Maxime D. Mar 30 '23 at 17:45
  • The actual compiled code will differ in details, but this gives the general idea of how strides can be used to map from one array shape to another. – hpaulj Mar 30 '23 at 18:11