I am looking to have two different views of the same data with the rows in a different order such that changes done through one view will be reflected in the other. Specifically, the following code

import numpy

# Create original array
A = numpy.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
B = A.view()[[0, 2, 1], :] # Permute the rows
print("(before) B =\n", B)

# Change a value in A
A[1, 2] = 143
print("(after) A =\n", A)
print("(after) B =\n", B)

has the following output:

(before) B =
 [[0 1 2]
  [6 7 8]
  [3 4 5]]
(after) A =
 [[  0   1   2]
  [  3   4 143]
  [  6   7   8]]
(after) B =
 [[0 1 2]
  [6 7 8]
  [3 4 5]]

but I would like the last bit of that to be

(after) B =
 [[0   1   2]
  [6   7   8]
  [3   4 143]]

Answers to this question state that getting a view at specific indices is not possible, though the OP of that question asks about a subset of the array, whereas I would like a view of the entire array. (The key difference seems to be slicing vs. advanced, or "fancy", indexing.)

A different post asking about slicing by rows and then columns vs. columns and then rows has an accepted answer that states "All that matters is whether you slice by rows or by columns...". So I tried working with a flattened view of the array:

import numpy

A = numpy.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
B = A.view()
B.shape = (A.size,)

A[1, 2] = 198
print("(After first) A =\n", A)
print("(After first) B =\n", B)

# Identity index map
all_idx = numpy.arange(A.size).reshape(A.shape)

# Swapped and flattened index map
new_row_idx = all_idx[[0, 2, 1]].flatten()

C = B[new_row_idx]

print("(Before second) C =\n", C)

# Manipulate through 'B'
B[7] = 666

print("(After second) B =\n", B)
print("(After second) C =\n", C)

which gives the following output:

(After first) A =
 [[  0   1   2]
 [  3   4 198]
 [  6   7   8]]
(After first) B =
 [  0   1   2   3   4 198   6   7   8]
(Before second) C =
 [  0   1   2   6   7   8   3   4 198]
(After second) B =
 [  0   1   2   3   4 198   6 666   8]
(After second) C =
 [  0   1   2   6   7   8   3   4 198]

As you can see, the entry of C at index 4 (the one that was read from B[7]) is unaltered. The suggested solution in the first post I mentioned is to create a copy, make changes, and then update the original array. I can write functions to wrap this, but that doesn't reduce the number of copies I will be making. All it does is hide them from the user.

What am I missing here? Should I be using the data attribute of these arrays? If so, what is a good starting point for understanding how to do this?

M.A.R.
  • Careful with the Zwinck answer in the second link. It narrowly addresses the OP question, and shouldn't be taken as a general explanation. Read the comments. – hpaulj Jan 07 '21 at 07:50
  • There fundamentally is no way to do this, at least in the narrow way this question is framed. Numpy works on contiguous chunks of memory; there are workarounds, but arbitrary permutations of the data will always break contiguity. It's really that simple... – senderle Jan 07 '21 at 09:31
  • However, if your fundamental goal is to write to an array in a permuted way, you can do so pretty easily — much more easily than in the code at the bottom of your question. (That is, unless I have misunderstood your goals.) But that's quite a different question than the one you've asked. – senderle Jan 07 '21 at 09:31
  • What kind of operations do you want to do on your permuted array? Simple one-by-one assignment like this or something more involved? You could possibly subclass. – Daniel F Jan 07 '21 at 10:11
  • My goal is to support different node ID numbering schemes used by different libraries that support operations on/with finite element meshes. The set of node coordinates does not change, but the global ID assigned to them is different. I would like to do this without copying the mesh data since this can be rather large in some cases. – M.A.R. Jan 07 '21 at 16:54

1 Answer

An array has a shape, strides, dtype and 1d data_buffer. A view will have its own shape, strides, dtype, and pointer to some place in the base's data_buffer. Indexing with a slice can be achieved with just these attributes.

But indexing with a list such as your [0,2,1] cannot be achieved this way. So numpy makes a new array with its own data_buffer, a copy. That [0,2,1] index list/array is not stored with the copy.
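The difference can be checked directly. The following is only a minimal sketch: it uses np.shares_memory as the test, and the names S, slice_shares and list_shares are illustrative, not anything from the question.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)

# Basic slicing: the result is a view onto A's data buffer.
S = A[1:, :]
# Indexing with a list: the result gets a fresh data buffer (a copy).
B = A[[0, 2, 1], :]

slice_shares = np.shares_memory(A, S)
list_shares = np.shares_memory(A, B)
print(slice_shares, list_shares)   # True False

# A write to A shows up in the slice view, but not in the list-indexed copy.
A[1, 2] = 143
print(S[0, 2], B[2, 2])            # 143 5
```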

In [43]: A = np.arange(9).reshape(3,3)
In [44]: B = A[[0,2,1],:]
In [45]: A
Out[45]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [46]: B
Out[46]: 
array([[0, 1, 2],
       [6, 7, 8],
       [3, 4, 5]])

ravel shows the order of elements in the data_buffer:

In [47]: A.ravel()
Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

The order of elements in B is different.

In [48]: B.ravel()
Out[48]: array([0, 1, 2, 6, 7, 8, 3, 4, 5])

In contrast, consider a row reordering with a slice:

In [49]: C = A[::-1,:]
In [50]: C
Out[50]: 
array([[6, 7, 8],
       [3, 4, 5],
       [0, 1, 2]])

In [52]: A.strides
Out[52]: (24, 8)

This is achieved by simply changing the strides:

In [53]: C.strides
Out[53]: (-24, 8)

Transpose is also a view, with changed strides:

In [54]: D = A.T
In [55]: D.strides
Out[55]: (8, 24)
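Because C and D only differ from A in their strides, writes through either of them land in A's buffer. A small sketch of that, with arbitrary values chosen for illustration:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
C = A[::-1, :]   # rows reversed: negative row stride
D = A.T          # transpose: strides swapped

C[0, 0] = 60     # C[0] is A[2], so this sets A[2, 0]
D[2, 0] = 20     # D[i, j] is A[j, i], so this sets A[0, 2]

print(A)
# [[ 0  1 20]
#  [ 3  4  5]
#  [60  7  8]]
```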

I was going to show C.ravel(), but realized that ravel (a reshape) has to make a copy here, even though C itself is a view.
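A short check of that behaviour, again using np.shares_memory as the test (flat_A and flat_C are illustrative names):

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
C = A[::-1, :]

flat_A = A.ravel()   # A is C-contiguous, so ravel can return a view
flat_C = C.ravel()   # C is not contiguous, so ravel has to copy

print(np.shares_memory(A, flat_A))   # True
print(np.shares_memory(A, flat_C))   # False
print(flat_C)                        # [6 7 8 3 4 5 0 1 2]
```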

The fundamental point is that anything that numpy describes as advanced indexing will make a copy. Changes to the copy will not appear in the original array. https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing
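That applies to reading. Assigning through an index array is different: it writes into the original array. So one possible workaround, sketched here under the assumption that keeping the permutation around as its own index array is acceptable, is to store the index array and go through it for every access. The name perm is hypothetical.

```python
import numpy as np

A = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])
perm = np.array([0, 2, 1])   # hypothetical row permutation, kept separately

# Reading through the index array gives a copy ...
B = A[perm, :]
B[2, 2] = -1                 # ... so this write does not touch A

# ... but assigning through the index array writes into A itself.
A[perm[2], 2] = 143          # row 2 of the permuted ordering is row 1 of A
A[perm, 0] = [10, 20, 30]    # permuted rows 0, 1, 2 are A's rows 0, 2, 1

print(A)
# [[ 10   1   2]
#  [ 30   4 143]
#  [ 20   7   8]]
```

This avoids holding a second full-size array, at the cost of an explicit index lookup on each access.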

hpaulj