
I am trying to preallocate an empty structured array of size 19x5 and define its data type at the same time, using the following code:

import numpy as np
arr=np.empty((19,5),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

The result is somewhat unexpected, yielding what looks like a 19x5x5 array. However, trying:

arr=np.empty((19,1),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

gives the proper length per row (5 fields), but the result apparently looks like a 1D array.

When I try to write this to a file, only this formatting is allowed:

np.savetxt(file, arr, delimiter=',', fmt='%s')

This suggests that each row is being handled as a single string. Is there no way to get a 19x5 structured array that is not flattened?

The main problem arises when writing this with savetxt. I want a CSV file that contains all 5 column values. Because each row is handled as a string, the output is wrong.

Fourier
  • You can use pandas DataFrame which is generally better than numpy's structured array. If you are willing to explore that option, say so. I will provide some example based on above question. – Hun Apr 08 '16 at 19:34
  • Thank you @Hun. I looked into this before. Fortunately, I managed to complete the code using numpy's structured arrays. I'll let you know if I need help with pandas. – Fourier Apr 11 '16 at 12:16

1 Answer

Typically the fields of a structured array replace the columns of a 2d array. Often people load a csv with genfromtxt and wonder why the result is 1d. As you found, you can make a 2d array with a compound dtype, but each element will then hold multiple values, as specified by the dtype.
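To make the shape behaviour concrete, here is a small sketch using the dtype from the question. The names `dt`, `grid` and `flat` are only illustrative.

```python
import numpy as np

# The compound dtype from the question: five named fields per element.
dt = np.dtype([('a', 'S1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')])

# A (19, 5) shape gives 19 * 5 = 95 elements, each holding all five fields.
grid = np.empty((19, 5), dtype=dt)
print(grid.shape)        # (19, 5)
print(grid['b'].shape)   # (19, 5): one 'b' value per element

# A 1d shape gives 19 records; the five fields act as the "columns".
flat = np.empty(19, dtype=dt)
print(flat.shape)        # (19,)
print(flat['b'].shape)   # (19,): the whole 'b' column
```

So the 19x5 layout the question asks for is already what the (19,) array provides, with the fields taking the place of the second dimension.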

Normally you'd initialize that array with a 1d shape, e.g. (19,).

Note that you have to fill values by field or with a list of tuples.
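A minimal sketch of both ways of filling, assuming the dtype from the question. The values are made up for illustration, and `np.zeros` is used instead of `np.empty` only so the untouched entries are predictable.

```python
import numpy as np

dt = np.dtype([('a', 'S1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')])

# Preallocate 19 records with a 1d shape.
arr = np.zeros(19, dtype=dt)

# Fill by field: each field behaves like an ordinary 1d column.
arr['b'] = np.linspace(0.0, 1.0, 19)
arr['c'] = np.arange(19)

# Fill one record with a tuple (one value per field, in dtype order).
arr[0] = (b'x', 1.5, 7, 2.5, 3.5)

# Fill several records from a list of tuples (illustrative values).
rows = np.array([(b'y', 0.5, 1, 0.25, 0.75),
                 (b'z', 2.0, 2, 4.0, 8.0)], dtype=dt)
arr[1:3] = rows
```

Note that a record must be given as a tuple, not a list; a list is interpreted as a sequence of separate records.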

I don't have experience using savetxt with a structured array, and can't run tests on this tablet. But there probably are SO questions that help.

savetxt iterates on an array, and writes fmt % tuple(row), where fmt is built from your input.

I'd suggest trying fmt='%s %s %s %s %s' - a % format for each field in the dtype. See the savetxt docs. Also, I don't know if a (19,) array will behave better than a (19,1).

Experiment with formatting elements of your array. They should look like tuples to the formatter. If not try tolist() or tuple(A[0]).
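Such an experiment might look like the sketch below. It uses a 'U1' (unicode) field in place of the question's '|S1', which is an assumption made here so that `%s` prints the bare character; on Python 3 a bytes field would typically come out as `b'x'`. The format string and sample values are illustrative.

```python
import numpy as np

# Same layout as the question, but with 'U1' instead of '|S1' (assumption).
dt = np.dtype([('a', 'U1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')])
arr = np.array([('x', 1.5, 7, 2.5, 3.5)], dtype=dt)

# One % specifier per field, separated by commas for CSV-style output.
fmt = '%s,%.2f,%d,%.2f,%.2f'

# A single record converts to a tuple with one entry per field.
line = fmt % tuple(arr[0])
print(line)

# tolist() turns the whole array into a list of plain Python tuples.
line_from_list = fmt % arr.tolist()[0]
```

If a record formats correctly this way, the same `fmt` should be usable with savetxt on a 1d array.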

Here's an answer that is almost good enough to be a duplicate:

https://stackoverflow.com/a/35209070/901925

 ab = np.zeros(names.size, dtype=[('var1', 'S6'), ('var2', float)])
 np.savetxt('test.txt', ab, fmt="%10s %10.3f")

===================

savetxt can only handle a 1d structured array, because of the tuple formatting.
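Putting this together for the CSV case in the question, a hedged sketch: a 1d structured array written with one format specifier per field. The 'U1' field (instead of '|S1'), the sample values, and the temporary file path are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# 'U1' is used instead of '|S1' so that %s writes the bare character (assumption).
dt = np.dtype([('a', 'U1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')])

# 1d structured array: one record per row of the CSV.
arr = np.array([('x', 1.5, 7, 2.5, 3.5),
                ('y', 0.5, 1, 0.25, 0.75)], dtype=dt)

# With a multi-specifier fmt the delimiter argument is not used,
# so the commas go directly into the format string.
path = os.path.join(tempfile.mkdtemp(), 'out.csv')
np.savetxt(path, arr, fmt='%s,%.2f,%d,%.2f,%.2f')

# Read the file back to inspect what was written.
with open(path) as fh:
    lines = fh.read().splitlines()
print(lines)
```

The number of `%` specifiers has to match the number of fields; otherwise savetxt raises an error about the format.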

hpaulj