5

I am really new to deep learning. I am working on a task that asks: "Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn."

Here is my code:

import pandas as pd
from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers import Dense
from sklearn.model_selection import train_test_split

concrete_data = pd.read_csv('https://cocl.us/concrete_data')

n_cols = concrete_data.shape[1]
model = Sequential()
model.add(Dense(units=10, activation='relu', input_shape=(n_cols-1,)))

model.compile(loss='mean_squared_error',
          optimizer='adam')


y = concrete_data.Cement
x = concrete_data.drop('Cement', axis=1)
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size = 0.3)

model.fit(xTrain, yTrain, epochs=50)

To evaluate the mean squared error, I wrote this:

from sklearn.metrics import mean_squared_error
predicted_y = model.predict(xTest)
mean_squared_error(yTest, predicted_y)

And I got this error:

y_true and y_pred have different number of output (1!=10)

The shape of my predicted_y is (309, 10).

I googled it and couldn't find an answer that solves this problem. I don't know what is wrong with my code.

user907988

3 Answers

8

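Your yTest holds one target value per sample, but the last layer of your model has 10 neurons, so the model outputs 10 values per sample (hence the (309, 10) shape of predicted_y). That mismatch between 1 and 10 outputs is what the error reports.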

You need to change the number of neurons in the output layer to 1 or add a new output layer which has only 1 neuron.

The code below should work for you.

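import pandas as pd
# Public Keras API; tensorflow.python.keras is an internal module path
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split

concrete_data = pd.read_csv('https://cocl.us/concrete_data')

n_cols = concrete_data.shape[1]
model = Sequential()
# Hidden layer: 10 units, one input per feature column
model.add(Dense(units=10, activation='relu', input_shape=(n_cols-1,)))
# Output layer: a single unit, so each sample gets exactly one predicted value
model.add(Dense(units=1))
model.compile(loss='mean_squared_error',
              optimizer='adam')

# Target column as in the question; use the strength column instead
# if concrete strength is what you want to predict
y = concrete_data.Cement
x = concrete_data.drop('Cement', axis=1)
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size=0.3)

model.fit(xTrain, yTrain, epochs=50)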
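
Once the model ends in a single-unit layer, model.predict(xTest) returns an array of shape (N, 1), which mean_squared_error accepts alongside a 1-D yTest. The sketch below uses only NumPy, with made-up numbers standing in for real predictions (the arrays and the helper name mse are illustrative and not part of the original code), to show the shape bookkeeping and the quantity being computed. For a single output, this mean of squared differences is what mean_squared_error returns by default.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of squared differences for a single-output regression."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float)
    # A 2-D prediction array must have exactly one column (one output per sample).
    if y_pred.ndim == 2 and y_pred.shape[1] != 1:
        raise ValueError(
            f"expected 1 output per sample, got {y_pred.shape[1]}"
        )
    y_pred = y_pred.ravel()  # (N, 1) -> (N,), avoids accidental (N, N) broadcasting
    if y_true.shape != y_pred.shape:
        raise ValueError(
            f"{y_true.shape[0]} targets but {y_pred.shape[0]} predictions"
        )
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up stand-ins: 4 targets and the (4, 1) array a one-unit output layer would give.
y_test = np.array([10.0, 20.0, 30.0, 40.0])
pred_one_unit = np.array([[12.0], [18.0], [33.0], [40.0]])
print(mse(y_test, pred_one_unit))  # (4 + 4 + 9 + 0) / 4 = 4.25

# A (4, 10) array, as a 10-unit final layer would produce, is rejected.
pred_ten_units = np.zeros((4, 10))
try:
    mse(y_test, pred_ten_units)
except ValueError as err:
    print(err)
```

The ravel() step matters when computing the error by hand: subtracting an (N, 1) array from an (N,) array in NumPy broadcasts to (N, N) rather than raising an error.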
Batuhan B
3

What you want to compute is the mean_squared_error between y_test and predicted_y.

First get your model's predictions on x_test:

predicted_y = model.predict(x_test)

Then you can calculate the error:

mean_squared_error(y_test, predicted_y)
theletz
  • Yes, I tried this before, but it gives me this error: y_true and y_pred have different number of output (1!=10) – user907988 Apr 02 '20 at 09:29
-1
y_pred = model.predict(x_test).sum(axis=1)

Try this, it worked for me
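
Summing over axis 1 collapses the (N, 10) predictions to shape (N,), which is why the shape error disappears. Whether the resulting numbers are meaningful is a separate question. The NumPy-only sketch below uses made-up values and assumes, purely for illustration, that each of the 10 outputs roughly approximates the target. Under that assumption the row sum is about 10 times too large, while the row mean stays on the target's scale. Neither is a substitute for giving the model a single output unit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up targets and a (5, 10) prediction array in which every column
# is the target plus small noise (an assumption for illustration only).
y_true = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
pred = y_true[:, None] + rng.normal(scale=0.5, size=(5, 10))

summed = pred.sum(axis=1)      # shape (5,), roughly 10 * y_true
averaged = pred.mean(axis=1)   # shape (5,), roughly y_true

mse_sum = float(np.mean((y_true - summed) ** 2))
mse_mean = float(np.mean((y_true - averaged) ** 2))

print(summed.shape, averaged.shape)
print(mse_sum > mse_mean)
```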

Princejr