5

I have a structured array of the form

output = np.zeros(names.size, dtype=[('name', 'U32'), ('r', float),('m',float)])

Then I tried to save it to a CSV file using np.savetxt. Is there a way to also write the label of each column as the header of the CSV file?

Thank you in advance.

somebodyzh
  • `output.dtype.names` is a tuple of those field names. You could use that to format a header line, e.g. `' '.join(output.dtype.names)` – hpaulj Jun 25 '18 at 16:12
  • @hpaulj Wouldn't that just give me three columns, with all my data appearing in the first column? – somebodyzh Jun 26 '18 at 16:08
  • The header line doesn't affect the layout of the data. That's handled by the `fmt` as specified in the answer. To `savetxt`, the header is just a string it writes along with the comment character. – hpaulj Jun 26 '18 at 16:18
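To illustrate the comments above, here is a minimal sketch of building the header from `output.dtype.names` while `fmt` controls the data layout. The sample values are made up for illustration, and the file is written to an in-memory buffer only so the result is easy to inspect; `comments=''` is an assumption that you want the header without the leading `# `.

```python
import io
import numpy as np

# Hypothetical sample data standing in for the question's `output` array.
output = np.zeros(2, dtype=[('name', 'U32'), ('r', float), ('m', float)])
output['name'] = ['a', 'b']
output['r'] = [1.0, 2.0]
output['m'] = [0.5, 1.5]

# dtype.names is a tuple of field names; join it into one header string.
header = ','.join(output.dtype.names)

# The header is written as-is; fmt alone decides how each row is laid out.
buf = io.StringIO()
np.savetxt(buf, output, fmt='%s,%f,%f', header=header, comments='')
text = buf.getvalue()
print(text)
```

Replacing `buf` with a filename such as `'foo.csv'` should write the same content to disk.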

3 Answers

2

You could try a solution similar to this SO answer to pivot the data:

import numpy as np

dtypes = [('name', 'U32'), ('r', float), ('m', float)]
a = np.zeros(5, dtype=dtypes)
# vstack needs a sequence, not a map iterator, on Python 3
b = np.vstack(list(map(list, a)))

This maps list over the records (tuples) of the structured array and then stacks them vertically into a 2-D array.

Then you can do the following...

names = [n for n, t in dtypes]
np.savetxt('test.csv', b, header=','.join(names), fmt=','.join(['%s']*b.shape[1]))
ryanjdillon
0

Below is sample code:

output = np.zeros(names.size, dtype=[('name', 'U32'), ('r', float),('m',float)])
np.savetxt("foo.csv", output, delimiter=",", header="name,r,m", fmt="%s,%f,%f", comments='')

As documented here.
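One way to check the column layout without relying on a spreadsheet program is to read the output back with `np.genfromtxt`. The sketch below assumes made-up sample values and uses an in-memory buffer instead of `foo.csv`; the `encoding='utf-8'` argument is passed so the string column comes back as text.

```python
import io
import numpy as np

# Hypothetical sample data in place of the question's `output` array.
output = np.zeros(2, dtype=[('name', 'U32'), ('r', float), ('m', float)])
output['name'] = ['a', 'b']
output['r'] = [1.0, 2.0]

# Same call as above, but writing to a buffer instead of a file.
buf = io.StringIO()
np.savetxt(buf, output, header="name,r,m", fmt="%s,%f,%f", comments='')
buf.seek(0)

# names=True takes the field names from the first line of the file.
loaded = np.genfromtxt(buf, delimiter=',', names=True, dtype=None,
                       encoding='utf-8')
print(loaded.dtype.names)
print(loaded['r'])
```

If the field names and the `r` values come back as separate columns here, the file itself is comma-separated as intended.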

chifu lin
  • It gives me 3 columns this way, but all the data appears to be in the first column. I am guessing it is because a structured array has each element in the form of 'name float float'. Is there a way to separate each element into 3 columns? – somebodyzh Jun 26 '18 at 16:04
0

I found myself struggling with this problem frequently, so I wrote a function that generates a header and formatting string to use with np.savetxt.

You can find the code on GitHub Gist.

I haven't tested it extensively, but it can deal with most data types and (optionally) generates automatically padded output. The output is nicely formatted and human-readable, supports drag-and-drop into Excel, and can be read back easily, with the field names and dtypes autodetected (mostly).

Sample output:

# x     y1       y2 bools   verylongnamewithshortcontent                     bytes                              objects
  0    -25  3.9e+03     1                              a      b'AvvOkBhFJZIUQsxdg'  {'key1': 12423, 'key2': 'asdfjkl;'}
  1    255    8e+03     1                              a                b'SxKvotv'  {'key1': 12423, 'key2': 'asdfjkl;'}
  2   -211  2.5e+03     0                              a              b'tvBQXgqqS'  {'key1': 12423, 'key2': 'asdfjkl;'}
  3   -830  5.7e+02     1                              a        b'NCFrZHfniaZjeUg'  {'key1': 12423, 'key2': 'asdfjkl;'}
  4  -3468  8.7e+03     0                              a          b'RxzuvyKCxKBsz'  {'key1': 12423, 'key2': 'asdfjkl;'}
  5   4644  2.2e+03     1                              a              b'yHijSVfCv'  {'key1': 12423, 'key2': 'asdfjkl;'}
  6  27946    4e+03     0                              a            b'ywyZeQICJrY'  {'key1': 12423, 'key2': 'asdfjkl;'}
  7 313770  3.2e+03     1                              a    b'HBEufqJuASVxHRIxpjd'  {'key1': 12423, 'key2': 'asdfjkl;'}
  8 -76304  7.7e+02     0                              a                     b'UX'  {'key1': 12423, 'key2': 'asdfjkl;'}
  9 427810  8.4e+03     0                              a            b'jmnOEWCvTWg'  {'key1': 12423, 'key2': 'asdfjkl;'}

Input / Output dtypes:

[('x', '<i4'), ('y1', '<i4'), ('y2', '<f8'), ('bools', '?'), ('verylongnamewithshortcontent', '<U7'), ('bytes', 'S20'), ('objects', 'O')]

[('x', '<i4'), ('y1', '<i4'), ('y2', '<f8'), ('bools', '<i4'), ('verylongnamewithshortcontent', '<U1'), ('bytes', '<U22'), ('objects', '<U35')]
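For readers who only want the general idea, the following is a rough sketch of deriving a header and a format string from a structured dtype. It is not the code from the gist: `header_and_fmt` is a hypothetical helper, the kind-to-format mapping is an assumption, and it does no padding or alignment.

```python
import io
import numpy as np

def header_and_fmt(arr, delimiter=','):
    """Hypothetical helper: build a header and fmt string from arr.dtype."""
    # Integer kinds use %d, floats use %g, everything else falls back to %s.
    kind_to_fmt = {'i': '%d', 'u': '%d', 'f': '%g'}
    names = arr.dtype.names
    fmts = [kind_to_fmt.get(arr.dtype[n].kind, '%s') for n in names]
    return delimiter.join(names), delimiter.join(fmts)

# Made-up sample data for illustration.
data = np.zeros(2, dtype=[('x', 'i4'), ('y', 'f8'), ('label', 'U8')])
data['x'] = [1, 2]
data['y'] = [0.5, 2.5]
data['label'] = ['a', 'b']

header, fmt = header_and_fmt(data)
buf = io.StringIO()
np.savetxt(buf, data, fmt=fmt, header=header, comments='')
text = buf.getvalue()
print(text)
```

Falling back to `%s` keeps the sketch short; fixed-width output like the sample above would need per-column widths as well.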
xyzzyqed