
I have the following numpy structured array:

x = np.array([(22, 2, -1000000000.0, [1000,2000.0]), (22, 2, 400.0, [1000,2000.0])],
dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<f4'), ('f4', '<f4',2)])

As you can see, field 'f4' holds a 2-element subarray per record, so indexing it returns a matrix:

In [63]: x['f4']
Out[63]: 
array([[ 1000.,  2000.],
       [ 1000.,  2000.]], dtype=float32)

My end goal is a numpy structured array in which every field is a vector. How can I split 'f4' into two fields ('f41' and 'f42'), each holding one column of the matrix? The desired result:

In [67]: x
Out[67]: 
array([(22, 2, -1000000000.0, 1000.0, 2000.0),
       (22, 2, 400.0, 1000.0, 2000.0)], 
      dtype=[('f1', '<i4'), ('f2', '<i4'), ('f3', '<f4'), ('f41', '<f4'), ('f42', '<f4')])

I was also wondering whether this can be achieved with operations that modify the array in place, or with minimal copying of the original data.

snowleopard

1 Answer


You can do this by creating a new view of the array (ndarray.view) with a different dtype. A view reinterprets the same memory, so nothing is copied:

import numpy as np

x = np.array([(22, 2, -1000000000.0, [1000,2000.0]),
              (22, 2, 400.0, [1000,2000.0])],
             dtype=[('f1', '<i4'),
                    ('f2', '<i4'),
                    ('f3', '<f4'),
                    ('f4', '<f4', 2)])
xNewView = x.view(dtype=[('f1', '<i4'),
                         ('f2', '<i4'),
                         ('f3', '<f4'),
                         ('f41', '<f4'),
                         ('f42', '<f4')])
print(np.may_share_memory(x, xNewView)) # True
print(xNewView)
# array([(22, 2, -1000000000.0, 1000.0, 2000.0),
#        (22, 2, 400.0, 1000.0, 2000.0)], 
#       dtype=[('f1', '<i4'),  ('f2', '<i4'), ('f3', '<f4'),
#              ('f41', '<f4'), ('f42', '<f4')])

print(xNewView['f41'])           # array([ 1000.,  1000.], dtype=float32)
print(xNewView['f42'])           # array([ 2000.,  2000.], dtype=float32)

xNewView can then be used instead of x. Because both share the same memory, writes through one are visible through the other.
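
If the dtype has many fields, writing the flattened dtype by hand gets tedious. The following is a minimal sketch of a helper that builds it automatically and returns a view. The name split_subarray_fields and the naming scheme (field name plus a 1-based index) are my own choices for illustration, not part of numpy. It assumes a packed dtype without padding or custom offsets, and it refuses to proceed if the item sizes differ:

```python
import numpy as np

def split_subarray_fields(arr):
    """Hypothetical helper: view `arr` with each subarray field expanded
    into scalar fields named <field>1, <field>2, ... (no data is copied)."""
    new_descr = []
    for name in arr.dtype.names:
        field_dtype = arr.dtype[name]
        if field_dtype.subdtype is None:
            # Plain scalar field: keep it unchanged.
            new_descr.append((name, field_dtype))
        else:
            # Subarray field: one scalar field per element (C order).
            base, shape = field_dtype.subdtype
            count = int(np.prod(shape))
            new_descr.extend((f"{name}{i + 1}", base) for i in range(count))
    new_dtype = np.dtype(new_descr)
    # Assumption: the original dtype is packed. With padding or explicit
    # offsets the byte layouts would not line up, so bail out.
    if new_dtype.itemsize != arr.dtype.itemsize:
        raise ValueError("dtype layouts differ; cannot reinterpret as a view")
    return arr.view(new_dtype)

x = np.array([(22, 2, -1000000000.0, [1000, 2000.0]),
              (22, 2, 400.0, [1000, 2000.0])],
             dtype=[('f1', '<i4'), ('f2', '<i4'),
                    ('f3', '<f4'), ('f4', '<f4', 2)])

flat = split_subarray_fields(x)
print(flat.dtype.names)            # ('f1', 'f2', 'f3', 'f41', 'f42')

# Writing through the view changes the original array as well.
flat['f41'][0] = 5.0
print(x['f4'][0, 0])               # 5.0
```

The item size check is only a coarse safeguard. For dtypes created with align=True or explicit offsets, a field-by-field copy into a new array is the safer route.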

jotasi