
I have a numpy structured array with dtype that looks something like this:

In [20]: rectype = np.dtype([
   ....:         ('id',   '<i4'), # int
   ....:         ('price','<f4'), # float
   ....:         ('flag', 'a1'),  # char
   ....:         ('n',    'u1'),  # unsigned char
   ....:         ('r',    'i2'),  # short
   ....:         ('name', 'a4')   # char[4]
   ....:     ])

I'd like to process it with pandas and then get back a modified ndarray to load into memory on an embedded device. As has been mentioned here already, pandas changes the dtype of the char fields to object, so the resulting array is not compatible with the input:

In [21]: nda = np.fromiter([(1, 14.6, 'a', 0, 1, 'car')], dtype=rectype)
In [22]: a2 = pd.DataFrame.from_records(nda).to_records(index=False)
In [23]: a2.dtype
Out[23]: dtype([('id', '<i4'), ('price', '<f4'), ('flag', 'O'), ('n', 'u1'), ('r', '<i2'), ('name', 'O')])
In [24]: rectype.itemsize, a2.dtype.itemsize
Out[24]: (16, 27)

This is of course not very useful: the object fields hold pointers rather than the raw bytes, so the record size changes. In my case, the length of each string is fixed, and I need it that way to fit into the data structure. Is there a simple, efficient way to get back an array with exactly the same dtype as the one I started with?

Aryeh Leib Taurog

1 Answer


Call astype() with the original dtype on the record array returned by to_records():

pd.DataFrame.from_records(nda).to_records(index=False).astype(rectype)
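
To check that the cast really restores the original layout, a minimal sketch along the following lines may help. It is not part of the original answer. It assumes Python 3 with a reasonably recent NumPy and pandas, builds the sample record with np.array and bytes literals instead of np.fromiter, and spells the string fields as 'S1' and 'S4', the bytes-string codes for which 'a1' and 'a4' are aliases.

```python
import numpy as np
import pandas as pd

# Same field layout as the question's rectype.
# 'S1'/'S4' are the fixed-width bytes-string codes that 'a1'/'a4' alias.
rectype = np.dtype([
    ('id',    '<i4'),  # int
    ('price', '<f4'),  # float
    ('flag',  'S1'),   # char
    ('n',     'u1'),   # unsigned char
    ('r',     '<i2'),  # short
    ('name',  'S4'),   # char[4]
])

# One sample record, with bytes literals for the fixed-width string fields.
nda = np.array([(1, 14.6, b'a', 0, 1, b'car')], dtype=rectype)

# Round trip through pandas: the string fields come back as object dtype.
roundtrip = pd.DataFrame.from_records(nda).to_records(index=False)

# Cast back to the original structured dtype, as the answer suggests.
restored = roundtrip.astype(rectype)

# Compare record sizes and the raw bytes of the restored array.
print(roundtrip.dtype.itemsize, restored.dtype.itemsize)
print(restored.tobytes() == nda.tobytes())
```

One thing to watch: casting to a fixed-width field such as 'S4' truncates longer values without raising an error, so if the pandas processing can lengthen a string it is worth validating the lengths before calling astype().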
HYRY
  • Thanks! When you phrase it that way it's really a numpy question. This is a satisfactory solution. Still, I would have thought that a pandas round-trip should be an identity transformation. – Aryeh Leib Taurog Mar 26 '14 at 12:29