I'm fairly sure this is easy to solve, but I couldn't figure it out.
I'm getting this error:
ValueError: could not convert string to float: '2017-08-31 18:06:36.000000'
The code I'm using is the following:
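from pandas import read_csv
from matplotlib import pyplot
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
url = "train_set.csv"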
names = ['ID', 'order_status', 'order_products_value', 'order_freight_value', 'order_items_qty', 'order_sellers_qty', 'order_purchase_timestamp', 'order_aproved_at', 'order_estimated_delivery_date', 'order_delivered_customer_date', 'customer_city', 'customer_state', 'customer_zip_code_prefix', 'product_category_name_english', 'product_name_lenght', 'product_description_lenght', 'product_photos_qty', 'target']
dataset = read_csv(url, names=names)
print(dataset.shape)
print(dataset.head(100))
print("-----------------------------------------------------------------")
print(dataset.describe())
dataset.plot(kind='line', subplots=True, layout=[12,2], sharex=False, sharey=False)
pyplot.show()
array = dataset.values
X = array[:,0:17]
y = array[:,17]
print(y)
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)
models = []
models.append(('LR', LogisticRegression(solver='liblinear', multi_class='ovr')))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC(gamma='auto')))
# The error is raised in the loop below
results = []
names = []
print("---- X - Train ---------")
print(X_train)
print(Y_train)
for name, model in models:
    kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
    print(name)
    cv_results = cross_val_score(model, X_train, Y_train, cv=kfold.get_n_splits(X_train, Y_train), scoring='accuracy')
    results.append(cv_results)
    names.append(name)
    print('%s: %f (%f)' % (name, cv_results.mean(), cv_results.std()))
Any ideas? I think the error is in the models part, but I'm not sure; I'm quite new to machine learning.
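One way to check whether the models are really at fault: the same message seems to appear when a timestamp string is cast to float directly, with no model involved. A minimal sketch, assuming NumPy is available; the `sample` array is made up for illustration and only reuses the timestamp from the traceback.

```python
import numpy as np

# Hypothetical one-row stand-in for a slice of X that still holds a raw timestamp string.
sample = np.array([["2017-08-31 18:06:36.000000", 1.0]], dtype=object)

try:
    # Casting the whole array to float fails on the timestamp string.
    sample.astype(float)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

If this fails in the same way, the problem is probably in the columns passed as X (the date columns, and likely the other text columns too) rather than in any particular model.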
Edit: The idea is to convert the timestamps to Unix time.
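To make clear what is meant by Unix time, here is a minimal sketch of the conversion, assuming pandas and assuming the timestamps can be treated as UTC. The two-row frame and the `order_purchase_unix` column name are hypothetical; only `order_purchase_timestamp` comes from the real data.

```python
import pandas as pd

# Hypothetical two-row frame reusing one column name from the dataset.
df = pd.DataFrame({
    "order_purchase_timestamp": [
        "2017-08-31 18:06:36.000000",
        "1970-01-01 00:00:10.000000",
    ]
})

# Parse the strings into datetime64 values (naive, treated as UTC by assumption).
parsed = pd.to_datetime(df["order_purchase_timestamp"])

# Whole seconds elapsed since 1970-01-01 00:00:00.
epoch = pd.Timestamp("1970-01-01")
df["order_purchase_unix"] = (parsed - epoch) // pd.Timedelta(seconds=1)

print(df)
```

The other date columns would presumably need the same treatment before building X, and this sketch does nothing about text columns such as `customer_city`.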
Thanks a bunch!