I want to convert an unstructured array into a structured array. Here is my code:

import numpy as np
import numpy.lib.recfunctions as rf

data = np.arange(24).reshape(4,6)
dtype=[
    ("x", "f4"),
    ("y", "f4"),
    ("z", "f4"),
    ("red", "u1"),
    ("green", "u1"),
    ("blue", "u1"),
    ]
output_data = np.array(data, dtype=dtype)
print("output_data: ", output_data)                     

Here is the output

output_data:  [[(  0.,   0.,   0.,  0,  0,  0) (  1.,   1.,   1.,  1,  1,  1)
  (  2.,   2.,   2.,  2,  2,  2) (  3.,   3.,   3.,  3,  3,  3)
  (  4.,   4.,   4.,  4,  4,  4) (  5.,   5.,   5.,  5,  5,  5)]
 [(  6.,   6.,   6.,  6,  6,  6) (  7.,   7.,   7.,  7,  7,  7)
  (  8.,   8.,   8.,  8,  8,  8) (  9.,   9.,   9.,  9,  9,  9)
  ( 10.,  10.,  10., 10, 10, 10) ( 11.,  11.,  11., 11, 11, 11)]
 [( 12.,  12.,  12., 12, 12, 12) ( 13.,  13.,  13., 13, 13, 13)
  ( 14.,  14.,  14., 14, 14, 14) ( 15.,  15.,  15., 15, 15, 15)
  ( 16.,  16.,  16., 16, 16, 16) ( 17.,  17.,  17., 17, 17, 17)]
 [( 18.,  18.,  18., 18, 18, 18) ( 19.,  19.,  19., 19, 19, 19)
  ( 20.,  20.,  20., 20, 20, 20) ( 21.,  21.,  21., 21, 21, 21)
  ( 22.,  22.,  22., 22, 22, 22) ( 23.,  23.,  23., 23, 23, 23)]]

But what I expected is

output_data: [
[0., 1., 2., 3, 4, 5], 
....
]

How can I achieve this?

hao li
  • `rf.unstructured_to_structured(data,np.dtype(dtype))`? – Paul Panzer Jul 16 '20 at 07:07
  • This function seems to have been removed in Python 3.6.9. @PaulPanzer – hao li Jul 16 '20 at 07:14
  • I just demo'ed this in https://stackoverflow.com/questions/62922119/how-to-turn-a-numpy-array-to-a-numpy-object/62922504#62922504. – hpaulj Jul 16 '20 at 07:16
  • This is a `numpy` function. If it's not in your `rf` module, the numpy version may be too old (pre 1.17?). (Python version doesn't make a difference.) The data for a structured array has to be a list of tuples, not a list of lists. My link shows several alternative ways of constructing a structured array. – hpaulj Jul 16 '20 at 07:18
  • Python 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy.lib.recfunctions as rf
    >>> rf.unstructured_to_structured
    Traceback (most recent call last):
      File "", line 1, in
    AttributeError: module 'numpy.lib.recfunctions' has no attribute 'unstructured_to_structured'
    However, my Python version is 3.6.9. – hao li Jul 16 '20 at 07:19
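The suggestion in the comments above can be sketched as follows. This is a minimal example, assuming the installed NumPy is recent enough to ship `rf.unstructured_to_structured`; the `hasattr` guard and the tuple-based fallback are illustrative additions, not part of the original thread.

```python
import numpy as np
import numpy.lib.recfunctions as rf

data = np.arange(24).reshape(4, 6)
dtype = np.dtype([
    ("x", "f4"), ("y", "f4"), ("z", "f4"),
    ("red", "u1"), ("green", "u1"), ("blue", "u1"),
])

# Older NumPy releases do not provide this function, so check for it first.
if hasattr(rf, "unstructured_to_structured"):
    # The last axis of `data` (length 6) is mapped onto the 6 fields.
    output_data = rf.unstructured_to_structured(data, dtype)
else:
    # Assumed fallback: build the records from one tuple per row.
    output_data = np.array([tuple(row) for row in data], dtype=dtype)

print(output_data.shape)
print(output_data["red"])
```

The availability of the function depends on the NumPy version, not on the Python version, so printing `np.__version__` is a quick way to see which branch will run.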

1 Answer

If rf does not work for you, try this. As mentioned in the comments, a structured array has to be built from a list of tuples (one tuple per record), not a list of lists:

output_data = np.array([tuple(i) for i in data], dtype=dtype)

output:

print("output_data: ", output_data) 
output_data:  [( 0.,  1.,  2.,  3,  4,  5) ( 6.,  7.,  8.,  9, 10, 11)
 (12., 13., 14., 15, 16, 17) (18., 19., 20., 21, 22, 23)]

print(output_data['red'])
[ 3  9 15 21]
Ehsan
  • A for loop is too slow when dealing with large data. Is there a faster way? – hao li Jul 16 '20 at 08:15
  • I am not sure. Check out @hpaulj's post. Or I would recommend upgrading your numpy and using `rf`. – Ehsan Jul 16 '20 at 08:17
  • Most of the `rf` functions use the copy-by-fields approach that I demonstrate. Usually the number of fields is small compared to the number of records, so this is relatively fast(er). – hpaulj Jul 16 '20 at 15:14
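The copy-by-fields idea from the last comment might look like the sketch below. It is an illustration of the general technique rather than the exact code from hpaulj's linked answer: the loop runs once per field (6 iterations here) instead of once per record, and each assignment copies a whole column at once.

```python
import numpy as np

data = np.arange(24).reshape(4, 6)
dtype = np.dtype([
    ("x", "f4"), ("y", "f4"), ("z", "f4"),
    ("red", "u1"), ("green", "u1"), ("blue", "u1"),
])

# Allocate one record per row of the unstructured array.
output_data = np.zeros(data.shape[0], dtype=dtype)

# Copy column i into field i; NumPy casts each column to the field's type.
for i, name in enumerate(dtype.names):
    output_data[name] = data[:, i]

print(output_data["red"])
```

Because the column values are cast to each field's type on assignment, values that do not fit (for example, integers above 255 going into a `u1` field) would wrap or lose precision, so the input range is worth checking on real data.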