I am reading CSV data to build a model.
I understand how missing values are handled, so I have filled them using the median and zero, and dropped a few parameters that are of no interest.
I manually checked the CSV file by applying a filter for empty values. Whichever fields came up empty, I tried to fill. But I am still getting the error shown below.
Here is my code:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
dataset = pd.read_csv("model__newdata.csv",header = 0)
#Data Pre-processing
# Drop the target column and the features that are of no interest
data = dataset.drop(columns=['shift_location_id', 'status', 'city', 'open_positions'])
#Find median for features having NaN
median_role_id, median_specialty_id = data['role_id'].median(),data['specialty_id'].median()
median_shift_id = data['shift_id'].median()
# Assign the filled columns back instead of relying on chained inplace=True
data['shift_id'] = data['shift_id'].fillna(median_shift_id)
data['role_id'] = data['role_id'].fillna(median_role_id)
data['specialty_id'] = data['specialty_id'].fillna(median_specialty_id)
data['years_of_experience'] = data['years_of_experience'].fillna(0)
#Start training
labels = dataset.shift_location_id
train1 = data
algo = LinearRegression()
x_train , x_test , y_train , y_test = train_test_split(train1 , labels , test_size = 0.20,random_state =1)
# x_train.to_csv("x_train.csv", sep=',', encoding='utf-8')
# x_test.to_csv("x_test.csv", sep=',', encoding='utf-8')
algo.fit(x_train,y_train)
algo.score(x_test,y_test)
Error:
ValueError Traceback (most recent call last)
<ipython-input-27-99f96096832a> in <module>
32 # x_test.to_csv("x_test.csv", sep=',', encoding='utf-8')
33
---> 34 algo.fit(x_train,y_train)
ValueError: could not convert string to float: 'none'
Any suggestions on how to resolve this?
Edit 1 - Sample data - https://gist.githubusercontent.com/karimkhanvi/d69c98352aaaaed87f787a20c05307f8/raw/a45bb471fc1ee5095a1d0c3809a8362c001f639e/temp.csv
Edit 2 - I already checked "ValueError: could not convert string to float: id" before I posted.
I would appreciate it if you noted that I do not have an issue with the data type of any parameter.
ValueError: could not convert string to float: 'none'
I am facing this issue because of empty values. I have tried to deal with them, but that did not solve my problem. That is why I posted this question.
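If the empty cells are stored in the file as the literal text 'none', one option might be to have read_csv treat that text as missing, so that the fillna calls can reach it. This is only a sketch: the CSV text and the choice of years_of_experience as the affected column are hypothetical, not taken from the real file.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the real file; values are made up.
csv_text = "shift_id,years_of_experience\n1,2\n2,none\n3,5\n"

# na_values adds 'none' to the strings that read_csv parses as NaN.
df = pd.read_csv(io.StringIO(csv_text), na_values=["none"])

# Count the cells that are now recognised as missing.
missing = int(df["years_of_experience"].isnull().sum())

# Fill with zero, as the question does for this column.
df["years_of_experience"] = df["years_of_experience"].fillna(0)
```

With na_values set this way, the column is parsed as numeric rather than as strings.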
Edit 3
I checked whether any value is null:
data.isnull().values.any()
data.isnull().sum()
These give False
and
shift_id 0
user_id 0
shift_organization_id 0
shift_department_id 0
role_id 0
specialty_id 0
years_of_experience 0
nurse_zip 0
shifts_zip 0
dtype: int64
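Since isnull() only counts real NaN values, a literal string such as 'none' would not show up in these counts. Below is a minimal sketch of how such values might be located. It assumes a small made-up frame (the column names follow the question, the values are hypothetical), and find_non_numeric is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical sample; values are made up, not taken from the CSV.
df = pd.DataFrame({
    "shift_id": [1, 2, 3],
    "nurse_zip": ["94110", "none", "10001"],
})

def find_non_numeric(frame):
    """Return {column: [values that cannot be parsed as numbers]}."""
    bad = {}
    for col in frame.columns:
        # errors="coerce" turns unparseable entries into NaN
        converted = pd.to_numeric(frame[col], errors="coerce")
        # Keep entries that were present originally but failed to convert
        mask = converted.isna() & frame[col].notna()
        if mask.any():
            bad[col] = frame.loc[mask, col].unique().tolist()
    return bad

null_count = int(df.isnull().sum().sum())  # the string 'none' is not NaN
bad_values = find_non_numeric(df)
print(null_count, bad_values)
```

Running the same helper on the real data frame would list, per column, the text values that LinearRegression cannot convert to float.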