I have a 2D NumPy array, and I'd like to apply a specific dtype to each column.
import numpy as np
a = np.arange(25).reshape((5, 5))
In [40]: a
Out[40]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
In [41]: a.astype(dtype=[('width', '<i4'), ('height', '<i4'), ('depth', '<i4'), ('score', '<f4'), ('auc', '<f4')])
I expected line 41 to apply one dtype per column, but instead it "upcast" the array: the shape is still (5, 5), and every element became a full record with its value repeated in each of the five fields:
Out[41]:
array([[(0, 0, 0, 0.0, 0.0), (1, 1, 1, 1.0, 1.0), (2, 2, 2, 2.0, 2.0),
        (3, 3, 3, 3.0, 3.0), (4, 4, 4, 4.0, 4.0)],
       [(5, 5, 5, 5.0, 5.0), (6, 6, 6, 6.0, 6.0), (7, 7, 7, 7.0, 7.0),
        (8, 8, 8, 8.0, 8.0), (9, 9, 9, 9.0, 9.0)],
       [(10, 10, 10, 10.0, 10.0), (11, 11, 11, 11.0, 11.0),
        (12, 12, 12, 12.0, 12.0), (13, 13, 13, 13.0, 13.0),
        (14, 14, 14, 14.0, 14.0)],
       [(15, 15, 15, 15.0, 15.0), (16, 16, 16, 16.0, 16.0),
        (17, 17, 17, 17.0, 17.0), (18, 18, 18, 18.0, 18.0),
        (19, 19, 19, 19.0, 19.0)],
       [(20, 20, 20, 20.0, 20.0), (21, 21, 21, 21.0, 21.0),
        (22, 22, 22, 22.0, 22.0), (23, 23, 23, 23.0, 23.0),
        (24, 24, 24, 24.0, 24.0)]],
      dtype=[('width', '<i4'), ('height', '<i4'), ('depth', '<i4'), ('score', '<f4'), ('auc', '<f4')])
Why did this happen, given that the number of fields in the dtype matches the number of columns (so I didn't expect any upcasting)?
How can I take an existing array in memory and apply per-column dtypes, as I intended on line 41? Thanks.
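
To make the goal concrete, here is a minimal sketch of the result I'm after, built the long way. It assumes I want a 1-D array with one record per row. The names `dt` and `out` are just placeholders for illustration, and there may well be a more direct route than copying column by column.

```python
import numpy as np

a = np.arange(25).reshape((5, 5))

# Same field names and types as on line 41 (dt is a placeholder name).
dt = np.dtype([('width', '<i4'), ('height', '<i4'), ('depth', '<i4'),
               ('score', '<f4'), ('auc', '<f4')])

# Allocate a 1-D structured array with one record per row of `a`,
# then copy each column of `a` into the field with the matching position.
out = np.empty(a.shape[0], dtype=dt)
for i, name in enumerate(dt.names):
    out[name] = a[:, i]

print(out.shape)
print(out['width'])
print(out['score'])
```

Here `out['width']` holds the first column of `a` as int32 and `out['score']` holds the fourth column as float32, which is the per-column layout I expected `astype` to give me.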