In Python, I am using genfromtxt (from numpy) to read a text file into an array:

y = np.genfromtxt("1400list.txt", dtype=[('mystring','S20'),('myfloat','float')])

This works, except that it doesn't seem to read my two columns into a 2D array. I am getting:

[('string001', 123.0),('string002', 456.0),('string002', 789.0)]

But I think I would like:

[['string001', 123.0],['string002', 456.0],['string002', 789.0]]

I basically want each piece of information as a separate element that I can then manipulate.

Kenny Linsky
user1551817

1 Answer

What genfromtxt returns is called a structured array. It is a 1D array of records; each record looks like a tuple and has the dtype that you specified.

These are actually very useful once you learn how to use them. A regular 2D array cannot mix floats and strings, because every element shares a single dtype, but a structured array can hold both.
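As a rough illustration of that difference, here is a minimal sketch with made-up values. The field names mirror the ones in the question, and 'U20' (unicode) is my own choice in place of 'S20', so treat it as an assumption rather than part of the original code.

```python
import numpy as np

# A plain 2D array has a single dtype, so the floats are coerced to strings.
plain = np.array([['string001', 123.0], ['string002', 456.0]])
print(plain.dtype.kind)  # 'U' means a unicode string dtype

# A structured array keeps a separate type for each named field.
structured = np.array(
    [('string001', 123.0), ('string002', 456.0)],
    dtype=[('mystring', 'U20'), ('myfloat', float)],
)
print(structured['myfloat'].dtype)
```

In the plain array the numbers are no longer usable as numbers without converting them back, which is the main reason to prefer the structured form here.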

For example:

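import numpy as np
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
s = """string001 123
       string002 456
       string002 789"""
f = StringIO(s)
# 'S20' stores byte strings; Python 3 displays them with a b prefix, e.g. b'string001'
y = np.genfromtxt(f, dtype=[('mystring', 'S20'), ('myfloat', float)])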

This reproduces what you have so far. You can now access y in the following ways. A field name returns a column as a 1D array:

>>> y['mystring']
array(['string001', 'string002', 'string002'], 
  dtype='|S20')

>>> y['myfloat']
array([ 123.,  456.,  789.])

Note that y['myfloat'] gives floats because of the dtype argument, even though in the file they are ints.

Or, you can use an integer to get a row as a tuple with the given dtype:

>>> y[1]
('string002', 456.0)
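Since the goal in the question is to manipulate each piece of information, here is a small sketch of a few common operations on such an array. It assumes the same three-line sample data, and it uses 'U20' rather than 'S20' so that the strings compare equal to ordinary Python strings on Python 3. The variable names (big, rows) are only illustrative.

```python
import numpy as np
from io import StringIO

text = "string001 123\nstring002 456\nstring002 789"
# Assumption: 'U20' (unicode) instead of 'S20' to avoid byte strings on Python 3.
y = np.genfromtxt(StringIO(text), dtype=[('mystring', 'U20'), ('myfloat', float)])

# Select rows with a boolean mask built from one field (this returns a copy).
big = y[y['myfloat'] > 200]

# Modify a whole column in place through its field name.
y['myfloat'] *= 2

# If nested lists are really needed, convert the records explicitly.
rows = [list(record) for record in y.tolist()]
print(rows)
```

The nested-list form loses the per-column dtypes, so it is usually worth staying with the structured array for as long as possible.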

If you are doing a lot of manipulation of data structures like this, you might want to look into pandas.
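For comparison, a minimal sketch of reading the same kind of data with pandas, assuming whitespace-separated columns and no header row as in the sample above. The column names are carried over from the question and the grouping step is only an example of the sort of manipulation pandas makes easy.

```python
import pandas as pd
from io import StringIO

text = "string001 123\nstring002 456\nstring002 789"

# sep=r"\s+" splits on any run of whitespace; names supplies the column labels.
df = pd.read_csv(
    StringIO(text),
    sep=r"\s+",
    header=None,
    names=["mystring", "myfloat"],
    dtype={"myfloat": float},
)

# Example manipulation: sum the float column for each distinct string.
total_by_name = df.groupby("mystring")["myfloat"].sum()
print(total_by_name)
```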

askewchan