
I have this code to calculate the RMSE of my neural network's predictions, but when I put the same data into an Excel sheet and calculate the RMSE manually (√[Σ(y - predictions)² / n]), I get two completely different results: e.g. 0.6679957342326736 from Python and 0.426 from Excel. Any idea what I am doing wrong? My only idea is that, because I scale the inputs to the network, the RMSE might be computed from scaled values, although I don't think I scale OD_amount, because I drop it in the code before scaling.

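# imports (assumed: Keras via TensorFlow, scale from scikit-learn)
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import scale
from tensorflow.keras.models import load_model

# load the saved model
model = load_model('MCBESTSAVES/best_model45.h5')
# summarize the model
model.summary()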

# load the dataset
df = pd.read_csv('data/68_train.csv',  nrows=200)
df_prescaled = df.copy()
df_scaled = df.drop(['OD_amount'], axis=1)
df_scaled = scale(df_scaled)
cols = df.columns.tolist()
cols.remove('OD_amount')
df_scaled = pd.DataFrame(df_scaled, columns=cols, index=df.index)
df_scaled = pd.concat([df_scaled, df['OD_amount']], axis=1)
df = df_scaled.copy()

X = df.loc[:, df.columns != 'OD_amount'] 
y = df.OD_amount

predictions = model.predict(X, batch_size=112)
# RMSE between the unscaled target and the model's predictions
rmse = np.sqrt(mean_squared_error(y, predictions))
print(predictions)
print(rmse)
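
As a cross-check on the formula, here is a minimal sketch that computes the RMSE by hand in NumPy. The arrays are made-up stand-ins, not the data from the question, and `manual_rmse` is a hypothetical helper. It assumes the model returns predictions of shape (n, 1), as a Keras model with a single output unit typically does; in that case the predictions need flattening before they are subtracted from a 1-D target by hand, otherwise NumPy broadcasts the difference to an (n, n) matrix.

```python
import numpy as np


def manual_rmse(y_true, y_pred):
    """RMSE as sqrt(sum((y - pred)^2) / n), computed on flattened 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.sum((y_true - y_pred) ** 2) / y_true.size))


# Made-up values for illustration only (not the question's data).
y = np.array([1.0, 2.0, 3.0, 4.0])
predictions = np.array([[1.5], [2.0], [2.0], [5.0]])  # shape (4, 1), assumed model output shape

rmse_manual = manual_rmse(y, predictions)

# Same quantity written as the square root of the mean squared error.
rmse_mean = float(np.sqrt(np.mean((y - predictions.ravel()) ** 2)))

# Pitfall: without flattening, (4,) - (4, 1) broadcasts to a (4, 4) matrix.
broadcast_shape = (y - predictions).shape

print(rmse_manual, rmse_mean, broadcast_shape)
```

If the two values agree on your real `y` and `predictions`, the Python side matches the formula, and the difference most likely comes from the data or the formula used in the spreadsheet.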

[sample data](https://i.stack.imgur.com/71DmA.png)

Benuker
  • Sample data might be helpful here. – BigBen Mar 04 '22 at 13:42
  • @BigBen added sample data – Benuker Mar 04 '22 at 14:44
  • It would be helpful to add sample data as a reproducible dataframe, not a screenshot ... see [good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – BigBen Mar 04 '22 at 14:45
  • Excel and Python do produce the same results when using the same RMSE formula and the same data. Either one of your formulas is wrong or you are not using the same data. But you provide neither your formulas nor your data, so it's not possible to provide help. – aerobiomat Mar 07 '22 at 09:27
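
One way to rule out a data mismatch, as the last comment suggests, is to write out the exact targets and predictions that Python used and apply the spreadsheet formula to that file. The following is a minimal sketch with made-up values; `comparison_table` is a hypothetical helper and the column names are placeholders, not anything from the question.

```python
import io

import numpy as np
import pandas as pd


def comparison_table(y_true, y_pred):
    """Pair each target with its prediction and the squared error."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return pd.DataFrame({
        "y": y_true,
        "prediction": y_pred,
        "squared_error": (y_true - y_pred) ** 2,
    })


# Made-up values for illustration only (not the question's data).
table = comparison_table([1.0, 2.0, 3.0, 4.0], [[1.5], [2.0], [2.0], [5.0]])
rmse = float(np.sqrt(table["squared_error"].mean()))

# Written to an in-memory buffer here; pass a file path to to_csv instead
# to get a file that can be opened in Excel.
buffer = io.StringIO()
table.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(csv_text)
print(rmse)
```

With the exported file open in Excel, something like `=SQRT(AVERAGE(C2:C5))` over the `squared_error` column (range adjusted to the actual row count) should reproduce the Python value if both sides use the same data.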
