ValueError: could not convert string to float: b'user1'

Question

I am training and testing a model on a text dataset, but I get the error below. The CSV file contains text values.

When I run the code, it gives the output:

ValueError: could not convert string to float: b'user1'

Here, user1 is a text value inside the dataset.

Code:

from keras.models import Sequential
from keras.layers.core import Dense
from sklearn.model_selection import train_test_split
import numpy as np


seed = 9
np.random.seed(seed)

dataset = np.loadtxt('E:/7th Semester/FYP/ini/New folder/MBAT/DataSet/train_data.csv', delimiter=',', skiprows=1)


X = dataset[:,0:8]
Y = dataset[:,8]

(X_train, X_test, Y_train, Y_test) = train_test_split(X, Y, test_size=0.33, random_state=seed)


model = Sequential()
model.add(Dense(8, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(6, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))


model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), nb_epoch=100, batch_size=5)

scores = model.evaluate(X_test, Y_test)
print ("Accuracy: %.2f%%" %(scores[1]*100))

Complete Traceback error:

File "C:\Users\Lenovo\Anaconda3\lib\site-packages\numpy\lib\npyio.py", line 725, in floatconv
    return float(x)

ValueError: could not convert string to float: b'user1'

Hi @dashti, Can you share the full `Traceback` error? This will help identify which part of code is causing the error. — amanb, Dec 15 '18 at 11:07
hi @amanb , File "C:\Users\Lenovo\Anaconda3\lib\site-packages\numpy\lib\npyio.py", line 725, in floatconv return float(x) ValueError: could not convert string to float: b'user1' — dashti, Dec 15 '18 at 11:25
@dashti The **FULL** backtrace, and please in the question, not as a comment. — Matthieu Brucher, Dec 15 '18 at 11:32
Without a sample of the `csv` text file we can't help. The discussion indicates that this file not only has string columns, but has variable length rows, or missing values. — hpaulj, Dec 15 '18 at 17:05
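
The point about variable-length rows can be checked before choosing a loader. The following is a minimal sketch, not the asker's data: the in-memory `sample` text and the helper name `field_counts` are hypothetical stand-ins, and the real file would be opened with `open(path, newline='')` instead.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the real CSV file.
sample = io.StringIO(
    "name,a,b,label\n"
    "user1,1,2,0\n"
    "user2,3,4,1,extra\n"
    "user3,5\n"
)

def field_counts(handle, delimiter=","):
    """Compare every data row's field count against the header's."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    counts = Counter()   # how many rows have each field count
    bad_rows = []        # (line number, field count) for mismatched rows
    for line_no, row in enumerate(reader, start=2):
        counts[len(row)] += 1
        if len(row) != expected:
            bad_rows.append((line_no, len(row)))
    return expected, counts, bad_rows

expected, counts, bad_rows = field_counts(sample)
print("expected fields:", expected)
print("rows with a different count:", bad_rows)
```

If `bad_rows` is non-empty, the file probably needs cleaning before any NumPy loader can return a regular 2-D array.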

amanb · Answer 1 · 2018-12-15T12:07:15.013

0

According to the official NumPy documentation, the default dtype of the array returned by `numpy.loadtxt()` is float. Since user1 is a string that cannot be converted to float, you get this error. You could try the following:

np.genfromtxt('/path/to/csv', dtype=None, delimiter=',', names=True, case_sensitive=True, invalid_raise=False)
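
To illustrate the difference, here is a small sketch on made-up data. The column names `name`, `a`, `b`, `label` and the in-memory CSV text are assumptions for illustration only, and `encoding="utf-8"` is added so that string fields come back as `str` rather than bytes. With `dtype=None` and `names=True`, `genfromtxt` returns a 1-D structured array, so columns are selected by name rather than by integer position.

```python
import io
import numpy as np

# Hypothetical CSV content with one string column and three numeric columns.
csv_text = "name,a,b,label\nuser1,1,2,0\nuser2,3,4,1\n"

# loadtxt defaults to dtype=float, so the string column raises ValueError.
try:
    np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)
    loadtxt_failed = False
except ValueError as exc:
    loadtxt_failed = True
    print("loadtxt failed:", exc)

# genfromtxt with dtype=None infers a type per column.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=None,
    names=True,
    encoding="utf-8",
)
print(data.dtype.names)

# Build plain float arrays from the numeric columns only.
X = np.column_stack([data["a"], data["b"]]).astype(float)
Y = data["label"].astype(float)
print(X.shape, Y.shape)
```

The string column is left out of `X` here; if it carries information the model needs, it would have to be encoded as numbers first.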

edited Dec 15 '18 at 12:07

answered Dec 15 '18 at 11:39

amanb


I just added another approach using `np.genfromtxt()`. You could try that. – amanb Dec 15 '18 at 11:48
I got this error when I tried `np.genfromtxt()`: ValueError: Some errors were detected ! Line #5 (got 5 columns instead of 4) Line #8 (got 2 columns instead of 4) Line #11 (got 6 columns instead of 4) – dashti Dec 15 '18 at 11:59
If your csv has mixed dtypes, you should use `dtype=None`, I've edited the answer. Also, the error you are getting is due to the inconsistency detected in the number of columns. This [SO answer](https://stackoverflow.com/questions/23353585/got-1-columns-instead-of-error-in-numpy) suggests a workaround which I'm adding to my answer. – amanb Dec 15 '18 at 12:02
Please refer to the suggested answer to resolve the column inconsistencies. I've added the argument `invalid_raise=False` to my answer. However, unless we know what type of data exists in the csv, we cannot suggest what may work for you. – amanb Dec 15 '18 at 12:10
I have int and string data in the csv file. – dashti Dec 15 '18 at 12:23
Now I get this error from the line `Y = dataset[:,8]`: IndexError: index 8 is out of bounds for axis 1 with size 4 – dashti Dec 15 '18 at 12:26

This most likely means the loaded array has only 4 columns, so index 8 is out of bounds. Valid column indices are 0 to 3, so use 3 (or -1) for the last column. – amanb Dec 15 '18 at 12:41
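
One way to avoid hard-coded column indices is to derive the split from the array's shape. This is only a sketch on a made-up 3x4 array, assuming the label sits in the last column, which the thread does not confirm.

```python
import numpy as np

# Hypothetical numeric array with 3 rows and 4 columns.
dataset = np.arange(12, dtype=float).reshape(3, 4)

# Index 8 does not exist when axis 1 has size 4.
try:
    dataset[:, 8]
    index_failed = False
except IndexError as exc:
    index_failed = True
    print("IndexError:", exc)

n_cols = dataset.shape[1]
X = dataset[:, :n_cols - 1]  # every column except the last
Y = dataset[:, -1]           # last column, assumed to be the label
print(X.shape, Y.shape)
```

With this split, the `input_dim` of the first `Dense` layer would need to equal `X.shape[1]` rather than 8.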
