Extract columns using "genfromtxt"

Question

I've already read these two questions (q1 and q2) before asking this one, but I haven't found a satisfying answer.

I need to extract two columns from a 2D array without using pandas or loadtxt, only genfromtxt.

For now, what I did is:

X = np.genfromtxt('File1.csv',
                  delimiter='\t',
                  skip_header=0, skip_footer=0,
                  names=True, usecols=("Time",))

Y = np.genfromtxt('File1.csv',
                  delimiter='\t',
                  skip_header=0, skip_footer=0,
                  names=True, usecols=("Profit",))

Then, using matplotlib, I plot Y vs. X, and the result is perfect.

Now, I was thinking that I should do it the "right" way and avoid reading the file twice. So I tried the unpack feature:

X, Y = np.genfromtxt('File1.csv',
                     delimiter='\t',
                     skip_header=0, skip_footer=0,
                     names=True, usecols=("Time", "Profit"), unpack=True)

I get the message: too many values to unpack

Now if I write the previous command with one vector for the output (say Z) without unpacking, the vector Z will contain a tuple that cannot be plotted directly.

Is there a solution to this simple-looking problem?

I did, but I'd rather use the names from the header since the position of these columns change between files — SAAD, Dec 10 '13 at 23:47
Sorry I mean, do you experience the problem if you use numbers in usecols? — Roberto, Dec 10 '13 at 23:48

askewchan · Accepted Answer · 2013-12-11T00:07:41.950

When you have more than one named field, you will have a 1-d structured array, like so:

>>> np.genfromtxt('File1.csv', delimiter='\t', names=True, usecols=("Time", "Profit"))
array([(0.0, 1.0), (2.0, 3.0), (3.0, 4.0), (5.0, 6.0)], 
      dtype=[('Time', '<f8'), ('Profit', '<f8')])

You can't unpack a 1-d structured array: all that unpack=True does is transpose your array so that columns vary along the first axis, and the transpose of a 1-d array is itself. Thus, you get the same result with unpack:

>>> np.genfromtxt('File1.csv', delimiter='\t', names=True, usecols=("Time", "Profit"), unpack=True)
array([(0.0, 1.0), (2.0, 3.0), (3.0, 4.0), (5.0, 6.0)], 
      dtype=[('Time', '<f8'), ('Profit', '<f8')])
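
To see the "transpose of a 1-d array is itself" point in isolation, here is a minimal sketch. It uses an in-memory string as a hypothetical stand-in for File1.csv (the values are made up to match the output above). It deliberately does not call unpack=True: as far as I know, newer NumPy releases (1.20 and later) changed genfromtxt to return one array per field in that case, so the "too many values to unpack" error applies to older versions.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for File1.csv (tab-separated, header row).
text = "Time\tProfit\n0\t1\n2\t3\n3\t4\n5\t6\n"

Z = np.genfromtxt(io.StringIO(text), delimiter="\t", names=True)

print(Z.dtype.names)         # field names taken from the header line
print(Z.ndim, Z.shape)       # a single axis: one record per data row
print(Z.T.shape == Z.shape)  # transposing a 1-d array changes nothing
```

Each element of Z is one record holding both fields, which is why a plain transpose has no second axis to swap in.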

Even if you use numbers in your usecols argument as @Roberto suggests, you still have the problem because using names=True gives you a structured array if you have more than one field (which is why you didn't notice it with your first attempt).

If you save this as Z, you can plot it like this:

plt.plot(Z['Time'], Z['Profit'])

or you can split it as you originally asked:

X, Y = Z['Time'], Z['Profit']
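
Since the column positions change between files, it may help to wrap the single read and the split in a small helper. This is only a sketch: read_time_profit is a hypothetical name, and the two in-memory "files" (with an extra made-up Cost column) stand in for real files whose columns appear in different orders.

```python
import io
import numpy as np

# Hypothetical stand-ins for two files with the same columns in different orders.
file_a = "Time\tCost\tProfit\n0\t9\t1\n2\t9\t3\n3\t9\t4\n5\t9\t6\n"
file_b = "Profit\tTime\tCost\n1\t0\t9\n3\t2\t9\n4\t3\t9\n6\t5\t9\n"


def read_time_profit(source):
    """Read the source once and return the Time and Profit columns by name."""
    Z = np.genfromtxt(source, delimiter="\t", names=True,
                      usecols=("Time", "Profit"))
    # Indexing by field name gives ordinary 1-d float arrays.
    return Z["Time"], Z["Profit"]


X_a, Y_a = read_time_profit(io.StringIO(file_a))
X_b, Y_b = read_time_profit(io.StringIO(file_b))
```

With a real file you would pass the filename instead of the StringIO object, and then call plt.plot(X_a, Y_a) as before.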

Minor, pedantic, semantic quibble: *in general*, structured arrays can be n-dimensional, so saying "*a* structured array is actually 1d" is perhaps a bit misleading. The structured array *returned by `genfromtxt`* is 1-d. — Warren Weckesser, Dec 11 '13 at 00:03
