I have traffic data and want to predict the number of vehicles for the next hour from this hour's measurements: the number of vehicles and the average speed. Here is my code:
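import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

dataset = pd.read_csv('/content/final - Sayfa5.csv', delimiter=',')
dataset = dataset[['MINIMUM_SPEED', 'MAXIMUM_SPEED', 'AVERAGE_SPEED', 'NUMBER_OF_VEHICLES', '1_LAG_NO_VEHICLES']]
# Features: MAXIMUM_SPEED, AVERAGE_SPEED, NUMBER_OF_VEHICLES (columns 1 to 3)
X = np.array(dataset.iloc[:, 1:4])
# Target: 1_LAG_NO_VEHICLES as a column vector of shape (n_samples, 1)
Y = np.array(dataset.iloc[:, [4]])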
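# Scaling with MinMaxScaler: one scaler for the inputs, a separate one for the target.
# Refitting the same scaler on Y would overwrite the ranges learned from X.
scaler = MinMaxScaler()
X = scaler.fit_transform(X)
y_scaler = MinMaxScaler()
Y = y_scaler.fit_transform(Y)
print(X, Y)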
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3)
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
mlp = MLPRegressor(activation='logistic')
# For a single output, pass the target as a 1-D array
mlp.fit(X_train, Y_train.ravel())
predictions = mlp.predict(X_test)
predictions1 = mlp.predict(X_train)
print("mse_test :", mean_squared_error(Y_test, predictions), "mse_train :", mean_squared_error(Y_train, predictions1))
I get good MSE values, for example mse_test : 0.005467816018933008 and mse_train : 0.005072774796622158
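For context on these numbers: MSE is expressed in the squared units of the target, so a value computed on min-max-scaled targets is not directly comparable to one computed on raw vehicle counts. The sketch below illustrates the relationship; the arrays are made-up values chosen for illustration, not my data, and `y_scaler` and `value_range` are names introduced only for this example.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler

# Made-up hourly vehicle counts and predictions (illustrative only).
y_true = np.array([[40.0], [80.0], [120.0], [200.0], [240.0]])
y_pred = np.array([[50.0], [70.0], [130.0], [190.0], [250.0]])

# Fit the scaler on the true targets; data_range_ holds max - min per column.
y_scaler = MinMaxScaler().fit(y_true)
value_range = float(y_scaler.data_range_[0])

# Same predictions, scored in raw units and in scaled units.
mse_raw = mean_squared_error(y_true, y_pred)
mse_scaled = mean_squared_error(y_scaler.transform(y_true),
                                y_scaler.transform(y_pred))

# Multiplying the scaled MSE by (max - min)^2 recovers the raw-unit MSE.
print(mse_raw, mse_scaled, mse_scaled * value_range ** 2)
```

In this toy case every prediction is off by 10 vehicles, so the raw MSE is 100 while the scaled MSE is about 0.0025. The two numbers describe the same errors in different units.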
But I am confused about two points:
1. Should I scale the Y values? Many blog posts I have read say that one should not scale Y and should only scale X_train and X_test. But when I leave Y unscaled I get much worse MSE scores, such as 49, 50, 100, or even more.
2. How can I get predictions for new inputs in the original units rather than as scaled values? For example, I wrote:
Xnew=[[ 80 , 40 , 47],
[ 80 , 30, 81],
[ 80 , 33, 115]]
Xnew = scaler.transform(Xnew)
print("prediction for that input is" , mlp.predict(Xnew))
But I got scaled values: prediction for that input is [0.08533431 0.1402755 0.19497315]
I expected something like [81, 115, 102].
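A minimal sketch of one way to map scaled predictions back to vehicle counts, assuming the target was scaled with its own `MinMaxScaler`. The data here is synthetic (generated with NumPy as a stand-in for the CSV), and `x_scaler`, `y_scaler`, `max_iter=2000`, and `random_state=0` are choices made for this illustration only. The key step is `inverse_transform` on the scaler that was fitted on the target.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the traffic data (an assumption, not the real CSV).
# Columns: MAXIMUM_SPEED, AVERAGE_SPEED, NUMBER_OF_VEHICLES.
rng = np.random.default_rng(0)
X = rng.uniform([60, 20, 10], [120, 60, 200], size=(200, 3))
# Hypothetical target: next hour's count, loosely tied to this hour's count.
y = (0.9 * X[:, 2] + rng.normal(0, 5, size=200)).reshape(-1, 1)

# Separate scalers, so each remembers the range of its own data.
x_scaler = MinMaxScaler().fit(X)
y_scaler = MinMaxScaler().fit(y)

mlp = MLPRegressor(activation='logistic', max_iter=2000, random_state=0)
mlp.fit(x_scaler.transform(X), y_scaler.transform(y).ravel())

Xnew = np.array([[80, 40, 47],
                 [80, 30, 81],
                 [80, 33, 115]])

# The model predicts in scaled units because it was trained on scaled targets.
scaled_pred = mlp.predict(x_scaler.transform(Xnew))
# inverse_transform expects a 2-D array, so reshape to a column and flatten back.
pred = y_scaler.inverse_transform(scaled_pred.reshape(-1, 1)).ravel()
print("prediction for that input is", pred)
```

The same wiring can probably be delegated to `sklearn.compose.TransformedTargetRegressor`, which takes a `regressor` and a `transformer` and applies the inverse transform inside `predict`. I have not tried it on this data.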