So for a smaller m:
In [513]: m = np.mgrid[:3,:4]
In [514]: m.shape
Out[514]: (2, 3, 4)
In [515]: m
Out[515]:
array([[[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2]],
[[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]]])
In [516]: ll = list(zip(*(v.ravel() for v in m)))
In [517]: ll
Out[517]:
[(0, 0),
(0, 1),
(0, 2),
...
(2, 3)]
In [518]: a2=np.empty(m.shape[1:], dtype=object)
In [519]: a2.ravel()[:] = ll
In [520]: a2
Out[520]:
array([[(0, 0), (0, 1), (0, 2), (0, 3)],
[(1, 0), (1, 1), (1, 2), (1, 3)],
[(2, 0), (2, 1), (2, 2), (2, 3)]], dtype=object)
Making an empty array of the right shape and filling it via [:]= is the best way of controlling the object depth of such an array. np.array(...) defaults to the highest possible dimension, which in this case would be 3d.
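To make the depth issue concrete, here is a small sketch. The names nested, auto and filled are made up for this illustration and are not part of the session above.

```python
import numpy as np

# A 3x4 nested list whose innermost items are 2-tuples.
nested = [[(i, j) for j in range(4)] for i in range(3)]

# np.array descends into the tuples too, producing a numeric 3d array.
auto = np.array(nested)
print(auto.shape)

# Pre-allocating the object array fixes the shape, so each tuple stays whole.
filled = np.empty((3, 4), dtype=object)
filled.ravel()[:] = [t for row in nested for t in row]
print(filled.shape, filled[1, 2])
```

The pre-allocated array keeps one tuple per cell, while the np.array result has an extra trailing dimension of size 2.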
So the main question is: is there a better way of constructing that ll list of tuples?
a2.ravel()[:] = np.array(ll) does not work; it complains about fitting shape (12,2) into shape (12).
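A minimal reproduction of that failure, next to the list assignment that does work. The failed flag is only there for the illustration, and I'm assuming the error is raised as a ValueError, which is what a broadcasting mismatch normally gives.

```python
import numpy as np

m = np.mgrid[:3, :4]
ll = list(zip(*(v.ravel() for v in m)))
a2 = np.empty(m.shape[1:], dtype=object)

try:
    # The right-hand side is a (12, 2) array, which cannot broadcast
    # into the 12-element flat view.
    a2.ravel()[:] = np.array(ll)
    failed = False
except ValueError as err:
    failed = True
    print(err)

# A plain list of 12 tuples is taken as 12 separate objects, so it fits.
a2.ravel()[:] = ll
print(a2[2, 3])
```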
Working backwards, if I start with an array like np.array(ll) and turn it into a nested list, the assignment works, except that the elements of a2 are lists, not tuples:
In [533]: a2.ravel()[:] = np.array(ll).tolist()
In [534]: a2
Out[534]:
array([[[0, 0], [0, 1], [0, 2], [0, 3]],
[[1, 0], [1, 1], [1, 2], [1, 3]],
[[2, 0], [2, 1], [2, 2], [2, 3]]], dtype=object)
m's shape is (2,3,4) and np.array(ll)'s shape is (12,2), so m.reshape(2,-1).T produces the same thing:
a2.ravel()[:] = m.reshape(2,-1).T.tolist()
I could also have transposed first and then reshaped: m.transpose(1,2,0).reshape(-1,2).
To get tuples I need to pass the reshaped array through a comprehension:
a2.ravel()[:] = [tuple(l) for l in m.reshape(2,-1).T]
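Putting those pieces together, a sketch of the whole construction as a function. index_tuple_grid is a hypothetical name, and the extra tolist() call is my own variation so that the tuples hold plain Python ints rather than numpy scalars.

```python
import numpy as np

def index_tuple_grid(m):
    """Hypothetical helper: turn an mgrid-style (k, ...) array into an
    object array of k-tuples with shape m.shape[1:]."""
    out = np.empty(m.shape[1:], dtype=object)
    pairs = m.reshape(m.shape[0], -1).T      # shape (n, k), one row per cell
    out.ravel()[:] = [tuple(row) for row in pairs.tolist()]
    return out

m = np.mgrid[:3, :4]
grid = index_tuple_grid(m)
print(grid)

# Both reshape orders mentioned above give the same (12, 2) array.
same = np.array_equal(m.reshape(2, -1).T, m.transpose(1, 2, 0).reshape(-1, 2))
print(same)
```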
===============
m.transpose(1,2,0).astype(object) is still 3d; it has just replaced the integers with pointers to integers. There's a 'wall' between the array dimensions and the dtype. Things like reshape and transpose only operate on the dimensions, and don't penetrate that wall or move it. Lists are pointers all the way down. Object arrays use pointers only at the dtype level.
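A quick check of that 'wall', as a sketch (obj and cell are names chosen for this example):

```python
import numpy as np

m = np.mgrid[:3, :4]
obj = m.transpose(1, 2, 0).astype(object)

# Still three array dimensions; only the element type changed.
print(obj.shape, obj.dtype)

# Each element is an individual Python int, not a tuple or list.
print(type(obj[1, 2, 0]))

# Indexing two of the three dimensions returns a 1d object array,
# not a single Python object.
cell = obj[1, 2]
print(type(cell), cell.shape)
```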
Don't be afraid of the a2.ravel()[:]= expression. ravel is a cheap reshape, and assignment to a flattened version of an array may actually be faster than assignment to the 2d version. After all, the data (in this case pointers) is stored in a flat data buffer.
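One caveat worth checking, sketched below: the ravel assignment only writes through when ravel returns a view, which it does for a contiguous array such as one fresh from np.empty. For a non-contiguous array (the transpose here is just a convenient example) ravel makes a copy, and the assignment would silently go to that copy.

```python
import numpy as np

a2 = np.empty((3, 4), dtype=object)

# Fresh from np.empty the array is C-contiguous, so ravel() is a view.
flat = a2.ravel()
is_view = np.shares_memory(flat, a2)
print(is_view)

flat[:] = [(i, j) for i in range(3) for j in range(4)]
print(a2[2, 1])   # the write went through to a2

# A transposed array is not C-contiguous, so ravel() has to copy.
t = a2.T
copy_is_view = np.shares_memory(t.ravel(), t)
print(copy_is_view)
```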
But (after playing around a bit) I can do the assignment without the ravel or reshape (I still need the tolist to move the object boundary). The list nesting has to match the a2 shape down to the 'object' level.
a2[...] = m.transpose(1,2,0).tolist() # even a2[:] works
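Here is that direct assignment as a runnable sketch, followed by the tuple variant. The names nested and as_tuples are made up for the example. I'm relying on the behavior described above, that the nested list is unpacked only as deep as a2 has dimensions.

```python
import numpy as np

m = np.mgrid[:3, :4]
a2 = np.empty(m.shape[1:], dtype=object)

# Nested 3 x 4 x 2 list; the outer two levels match a2's shape and the
# innermost 2-element lists become the stored objects.
nested = m.transpose(1, 2, 0).tolist()
a2[...] = nested
print(a2[1, 2])          # a list

# Same idea with the innermost lists converted to tuples first.
as_tuples = [[tuple(pair) for pair in row] for row in nested]
a2[...] = as_tuples
print(a2[1, 2])          # a tuple
```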
(This brings to mind a discussion about giving np.array a maxdim parameter: Prevent numpy from creating a multidimensional array.)
The use of tolist seems like an inefficiency. But if the elements of a2 are tuples (or rather pointers to tuples), those tuples have to be created somehow. The C data buffer of m cannot be viewed as a set of tuples. tolist (with the [tuple...] comprehension) might well be the most efficient way of creating such objects.
==============
Did I note that the transpose can be indexed, producing 2 element arrays with the right numbers?
In [592]: m.transpose(1,2,0)[1,2]
Out[592]: array([1, 2])
In [593]: m.transpose(1,2,0)[0,1]
Out[593]: array([0, 1])
==================
Since tolist on a structured array produces tuples, I could do:
In [598]: a2[:]=m.transpose(1,2,0).copy().view('i,i').reshape(a2.shape).tolist()
In [599]: a2
Out[599]:
array([[(0, 0), (0, 1), (0, 2), (0, 3)],
[(1, 0), (1, 1), (1, 2), (1, 3)],
[(2, 0), (2, 1), (2, 2), (2, 3)]], dtype=object)
and thus avoid the list comprehension. It's not necessarily simpler or faster.
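A note on portability: 'i,i' describes two C int fields (normally 32 bits each), so the view above lines up only when m itself holds 32-bit integers. Where the default integer is 64 bits, I'd expect the reshape to fail. A sketch that builds the field dtype from m.dtype instead (pair and rec are names chosen for this example):

```python
import numpy as np

m = np.mgrid[:3, :4]
a2 = np.empty(m.shape[1:], dtype=object)

# Two fields with the same integer type as m, whatever that is here.
pair = np.dtype([("f0", m.dtype), ("f1", m.dtype)])

# copy() makes the transposed data contiguous so the last axis can be
# reinterpreted as one two-field record per grid cell.
rec = m.transpose(1, 2, 0).copy().view(pair).reshape(a2.shape)

# tolist() on a structured array yields tuples at the innermost level.
a2[:] = rec.tolist()
print(a2)
```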