I have a dataset with 679 rows and 16 columns in which about 30% of the values are missing. I decided to impute these missing values with the function impute.knn from the package impute, and I got a dataset with 679 rows and 16 columns but without the missing values.
But now I want to check the accuracy using the RMSE and I tried 2 options:
- load the package hydroGOF and apply the rmse function
- compute it directly with sqrt(mean((obs - sim)^2, na.rm = TRUE))
In both cases I get the error: Error in sim - obs : non-numeric argument to binary operator.
This is happening because the original data set contains an NA value (some values are missing).
How can I calculate the RMSE if I remove the missing values? Then obs and sim will have different sizes.
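
For illustration, here is a minimal base-R sketch of the direct computation that keeps obs and sim aligned. It assumes both are all-numeric and have the same dimensions. The helper name rmse_pairwise and the toy matrices are hypothetical, not part of hydroGOF or impute. A shared logical mask selects the same positions in both inputs, so they cannot end up with different sizes.

```r
# Hypothetical helper: RMSE over the positions where both inputs are non-missing.
rmse_pairwise <- function(obs, sim) {
  # Coerce data frames to matrices so that arithmetic works element-wise.
  obs <- as.matrix(obs)
  sim <- as.matrix(sim)
  stopifnot(is.numeric(obs), is.numeric(sim), all(dim(obs) == dim(sim)))

  # One mask applied to both inputs keeps the compared values paired up.
  keep <- !is.na(obs) & !is.na(sim)
  sqrt(mean((obs[keep] - sim[keep])^2))
}

# Toy data (assumed for illustration only).
obs <- matrix(c(1, NA, 3, 4), nrow = 2)
sim <- matrix(c(1.5, 2, 3, 3), nrow = 2)
rmse_pairwise(obs, sim)
```

Note that impute.knn returns a list rather than a matrix, so the imputed values would have to be taken from its data component before being passed as sim.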