I am quite new to numpy and I am trying to read a tab-delimited (\t) text file into a 2-D numpy array using the following code:
train_data = np.genfromtxt('training.txt', dtype=None, delimiter='\t')
File contents:
38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
30 State-gov 141297 Bachelors 13 Married-civ-spouse Prof-specialty Husband Asian-Pac-Islander Male 0 0 40 India >50K
What I expect is a 2-D array of shape (3, 15),
but with the code above I only get a 1-D array of shape (3,).
I am not sure why the fifteen fields of each row are not each assigned their own column.
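For reference, here is a self-contained sketch that reproduces the shape I am seeing. It builds the three rows in memory with io.StringIO instead of reading training.txt, and passes encoding="utf-8" explicitly; both are assumptions made only so the snippet runs on its own. The `raw` and `text` names are just for illustration.

```python
import io
import numpy as np

# The three sample rows; fields are joined with tabs below to mimic the file.
raw = [
    "38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K",
    "53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K",
    "30 State-gov 141297 Bachelors 13 Married-civ-spouse Prof-specialty Husband Asian-Pac-Islander Male 0 0 40 India >50K",
]
text = "\n".join("\t".join(line.split()) for line in raw)

# Same call as in the question, reading from an in-memory buffer.
train_data = np.genfromtxt(io.StringIO(text), dtype=None,
                           delimiter="\t", encoding="utf-8")

print(train_data.shape)        # 1-D: one element per row
print(train_data.dtype.names)  # one named field per column
```

Inspecting dtype.names suggests the fifteen columns are there, but as named fields of a structured array (one record per row) rather than as a second axis.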
I also tried numpy's loadtxt(), but it could not handle the type conversions for my data: even though I passed dtype=None, it tried to convert the strings to the default float type and failed.
Tried code:
train_data = np.loadtxt('try.txt', dtype=None, delimiter='\t')
Error:
ValueError: could not convert string to float: State-gov
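One thing I have experimented with, as a rough sketch only: asking loadtxt for strings via dtype=str. This assumes it is acceptable for every field, including the numeric ones, to come back as text. The data is again built in memory with io.StringIO so that the snippet is self-contained, and the `raw`, `text`, and `as_str` names are just for illustration.

```python
import io
import numpy as np

# Same three sample rows, joined with tabs to mimic the file.
raw = [
    "38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K",
    "53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K",
    "30 State-gov 141297 Bachelors 13 Married-civ-spouse Prof-specialty Husband Asian-Pac-Islander Male 0 0 40 India >50K",
]
text = "\n".join("\t".join(line.split()) for line in raw)

# dtype=str keeps every field as text, so no float conversion is attempted.
as_str = np.loadtxt(io.StringIO(text), dtype=str, delimiter="\t")

print(as_str.shape)  # 2-D: rows x columns
print(as_str[2, 1])  # the value that triggered the ValueError above
```

This gives a 2-D array, but the numeric columns are strings too, so they would need converting afterwards (for example with as_str[:, 0].astype(int)). That is not really the mixed-type result I am after.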
Any pointers?
Thanks