
I have a numpy structured array where I need to control the data types of the individual elements.

This is my example:

y = array([(True, 144.0), 
           (True, 86.0), 
           (True, 448.0), 
           (True, 76.0), 
           (True, 511.0)], dtype=object)

If I do:

print(y.dtype.fields)

I get back:

None

However, what I wanted was "bool" and "float". If I access individual elements, such as y[0][0] and y[0][1], I can see that they are indeed bool and float.
I am really confused by this. Any ideas?

I need this because I use the package "scikit-survival gradient boosting": https://scikit-survival.readthedocs.io/en/latest/generated/sksurv.ensemble.GradientBoostingSurvivalAnalysis.html#sksurv.ensemble.GradientBoostingSurvivalAnalysis.fit where the input needs to be a structured array with fields of type "bool" and "float".

JE_Muc
Kim O
  • What does `y.dtype` give you? – juanpa.arrivillaga Jul 11 '18 at 09:16
  • It gives `dtype('O')`, i.e. `np.object_`. This isn't a structured array. – Eric Jul 11 '18 at 09:17
  • @Eric How do I build a structured array with the same content, then? – Kim O Jul 11 '18 at 09:21
  • How did you build that array? – Eric Jul 11 '18 at 09:22
  • @KimO, it looks like you are using the [wrong answer](https://stackoverflow.com/a/51280303/9209546) from your [previous question](https://stackoverflow.com/questions/51279973/python-create-structured-numpy-structured-array-from-two-columns-in-a-dataframe) (or didn't read my comment on that answer). Take one step at a time and understand each step as you go along. This will help you resolve problems more quickly. – jpp Jul 11 '18 at 09:38

3 Answers


> I have a numpy structured array

No you don't:

np.array(..., dtype=object)

You have a numpy object array, which contains tuples.

You can convert it to a structured array with `y.astype([('b', bool), ('f', float)])`.
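As a minimal sketch of the same conversion, the snippet below rebuilds an object array of tuples like the one in the question and turns it into a structured array. It uses a round trip through `tolist()` rather than `astype`, purely as an alternative route; the variable names `pairs`, `record_dtype` and `structured` are made up for illustration, and the field names `'b'` and `'f'` are arbitrary.

```python
import numpy as np

# Rebuild an object array whose elements are plain Python tuples,
# mirroring the array shown in the question.
pairs = [(True, 144.0), (True, 86.0), (True, 448.0), (True, 76.0), (True, 511.0)]
y = np.empty(len(pairs), dtype=object)
for i, pair in enumerate(pairs):
    y[i] = pair

print(y.dtype.fields)  # None, because an object dtype has no fields

# Field names 'b' and 'f' are arbitrary labels chosen for this sketch.
record_dtype = np.dtype([('b', bool), ('f', float)])

# Go back to a list of tuples and let NumPy build one record per tuple.
structured = np.array(y.tolist(), dtype=record_dtype)

print(structured.dtype.names)                         # ('b', 'f')
print(structured['b'].dtype, structured['f'].dtype)   # bool float64
```

After the conversion, `structured.dtype.fields` is no longer `None`, and each column can be read by field name.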

Eric

When you initialize the structured array, ensure that you specify the data types.

For example:

import numpy as np

y = np.array([(True, 144.0), (True, 86.0), (True, 448.0)],
             dtype=[('col_1', 'bool'), ('col_2', 'f4')])

This should work, and

y.dtype.fields

shows the desired field types:

mappingproxy({'col_1': (dtype('bool'), 0), 'col_2': (dtype('float32'), 1)})
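To make that output easier to read, here is a small sketch that walks over the fields of the same array and pulls out each column by name. Note that `'f4'` is a 32-bit float; the last step widens it to 64-bit on the assumption that downstream code might prefer `float64` (check the consuming library's documentation). The names `events`, `times` and `y64` are illustrative only.

```python
import numpy as np

y = np.array([(True, 144.0), (True, 86.0), (True, 448.0)],
             dtype=[('col_1', 'bool'), ('col_2', 'f4')])

# dtype.names gives the field order; dtype.fields maps each name
# to a (dtype, byte offset) pair.
for name in y.dtype.names:
    field_dtype, offset = y.dtype.fields[name]
    print(name, field_dtype, offset)

# Indexing with a field name returns that column as an ordinary array.
events = y['col_1']
times = y['col_2']
print(times.dtype)  # float32

# Casting to a structured dtype with a wider float field ('f8').
y64 = y.astype([('col_1', 'bool'), ('col_2', 'f8')])
print(y64['col_2'].dtype)  # float64
```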

See the documentation here: https://docs.scipy.org/doc/numpy/user/basics.rec.html

lhay86

To create a structured array, you must specify the dtype beforehand. If you just pass a list of pairs to numpy.array without a structured dtype, you will not get a structured array: you get a plain 2-D array, or an object array if you ask for dtype=object. So, you need to do something like:

>>> mytype = np.dtype([('b', bool), ('f',float)])
>>> mytype
dtype([('b', '?'), ('f', '<f8')])

Then pass mytype to the array constructor:

>>> structured = np.array(
...    [(True, 144.0), (True, 86.0),
...     (True, 448.0), (True, 76.0),
...     (True, 511.0), (True, 393.0), 
...     (False, 466.0), (False, 470.0)], dtype=mytype)
>>>
>>> structured
array([( True,  144.), ( True,   86.), ( True,  448.), ( True,   76.),
       ( True,  511.), ( True,  393.), (False,  466.), (False,  470.)],
      dtype=[('b', '?'), ('f', '<f8')])
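If the data starts out as two separate columns rather than a list of pairs, one possible approach is to allocate the structured array first and then fill each field. This is only a sketch: `events` and `times` are hypothetical stand-ins for whatever columns you actually have, and the field names `'b'` and `'f'` follow the example above.

```python
import numpy as np

# Hypothetical column data; in practice these could come from a DataFrame.
events = [True, True, True, True, True, True, False, False]
times = [144.0, 86.0, 448.0, 76.0, 511.0, 393.0, 466.0, 470.0]

mytype = np.dtype([('b', bool), ('f', float)])

# Allocate uninitialized records, then assign one field at a time.
structured = np.empty(len(events), dtype=mytype)
structured['b'] = events
structured['f'] = times

print(structured.dtype.names)  # ('b', 'f')
print(structured[6])
```

Filling by field avoids building an intermediate list of tuples, which can be convenient when the columns are already arrays.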
juanpa.arrivillaga