This has been addressed before (here, here and here). I want to add a new field to a structured array returned by numpy's genfromtxt (also asked here).

My new problem is that the csv file I'm reading has only a header line and a single data row:

output-Summary.csv:

Wedge, DWD, Yield (wedge), Efficiency
1, 16.097825, 44283299.473156, 2750887.118836

I read it via genfromtxt and calculate a new value 'tl':

import numpy as np

test_out = np.genfromtxt('output-Summary.csv', delimiter=',', names=True)
tl = 300 / test_out['DWD']

test_out looks like this:

array((1., 16.097825, 44283299.473156, 2750887.118836),
      dtype=[('Wedge', '<f8'), ('DWD', '<f8'), ('Yield_wedge', '<f8'), ('Efficiency', '<f8')])

Using recfunctions.append_fields (as suggested in examples 1-3 above) fails because it calls len() on the size 1 array:

from numpy.lib import recfunctions as rfn
rfn.append_fields(test_out,'tl',tl)

TypeError: len() of unsized object

Searching for alternatives (one of the answers here) I find that mlab.rec_append_fields works well (but is deprecated):

from matplotlib import mlab
mlab.rec_append_fields(test_out,'tl',tl)

C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:1: MatplotlibDeprecationWarning: The rec_append_fields function was deprecated in version 2.2.
  """Entry point for launching an IPython kernel.
rec.array((1., 16.097825, 44283299.473156, 2750887.118836, 18.63605798),
          dtype=[('Wedge', '<f8'), ('DWD', '<f8'), ('Yield_wedge', '<f8'), ('Efficiency', '<f8'), ('tl', '<f8')])

I can also copy the array over to a new structured array "by hand" as suggested here. This works:

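# new_dt was not shown in the original post; assumed here to be the old dtype plus a float64 'tl' field
new_dt = np.dtype(test_out.dtype.descr + [('tl', '<f8')])
test_out_new = np.zeros(test_out.shape, dtype=new_dt)
# copy the existing fields over by name, then fill in the new one
for name in test_out.dtype.names:
    test_out_new[name] = test_out[name]
test_out_new['tl'] = tl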

In summary: is there a way to get recfunctions.append_fields to work with the genfromtxt output from my single-row csv file? I would much rather use a standard way to handle this than a home brew.

mrf

1 Answer

Reshape the array (and the new field) to shape (1,). With just one data line, genfromtxt loads the data as a 0d array, shape (). The rfn code isn't heavily used, and isn't as robust as it should be. In other words, the 'standard way' is still a bit buggy.

For example:

In [201]: arr=np.array((1,2,3), dtype='i,i,i')
In [202]: arr.reshape(1)
Out[202]: array([(1, 2, 3)], dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])

In [203]: rfn.append_fields(arr.reshape(1), 't1',[1], usemask=False)
Out[203]: 
array([(1, 2, 3, 1)],
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('t1', '<i8')])
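
Applied to the single-row file from the question, the same idea might look like the sketch below. It assumes np.atleast_1d as a convenient way to do the reshape (it turns a 0d array into shape (1,) and leaves longer arrays alone), and it reads the CSV text from an in-memory io.StringIO stand-in rather than the real output-Summary.csv; the names csv_text, test_out_1d and result are made up for illustration.

```python
import io

import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for output-Summary.csv from the question: one header line, one data row.
csv_text = (
    "Wedge, DWD, Yield (wedge), Efficiency\n"
    "1, 16.097825, 44283299.473156, 2750887.118836\n"
)

# With a single data row, genfromtxt returns a 0d structured array, shape ().
test_out = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# Promote the 0d array to shape (1,); arrays that are already 1d pass through unchanged.
test_out_1d = np.atleast_1d(test_out)

# The new column is computed from the reshaped array, so it also has shape (1,).
tl = 300 / test_out_1d["DWD"]

# usemask=False asks for a plain ndarray instead of a masked array.
result = rfn.append_fields(test_out_1d, "tl", tl, usemask=False)
print(result.shape, result.dtype.names)
```

Because atleast_1d is a no-op for files with several rows, the same code path should work regardless of how many data lines the file has.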

Nothing wrong with the home brew. Most of the rfn functions use that mechanism - define a new dtype, create a recipient array with that dtype, and copy the fields over, name by name.
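
As a rough sketch of that mechanism, the three steps can be wrapped in a small function. append_field is a hypothetical helper written for this answer, not a NumPy function, and it only handles adding a single field; it works on 0d and 1d inputs alike because it never calls len().

```python
import numpy as np


def append_field(base, name, values, dtype=None):
    """Hypothetical helper (not part of NumPy): copy `base` and add one field."""
    base = np.asarray(base)
    values = np.asarray(values, dtype=dtype)

    # 1. Define a new dtype: the existing fields plus the new one.
    new_dt = np.dtype(
        [(n, base.dtype[n]) for n in base.dtype.names] + [(name, values.dtype)]
    )

    # 2. Create a recipient array with that dtype and the same shape as the input.
    out = np.zeros(base.shape, dtype=new_dt)

    # 3. Copy the fields over, name by name, then fill in the new field.
    for field in base.dtype.names:
        out[field] = base[field]
    out[name] = values
    return out


arr = np.array((1, 2, 3), dtype="i,i,i")   # 0d structured array, shape ()
out0 = append_field(arr, "t1", 1)          # stays 0d
out1 = append_field(arr.reshape(1), "t1", [1])  # shape (1,)
print(out0.dtype.names, out0.shape, out1.shape)
```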

hpaulj