I have written code to calculate the RMSE between observed and simulated data, but I want to do this for the month of January only. The text file has the date in the first column, the simulated data in the 2nd column and the observed data in the 3rd column.
The format of the data is as follows:
DATE cout rout coub cinf
UNITS m3/s m3/s m3/s m3/s
1981-01-01 292.234 305 0 292.234
1981-01-02 293.152 320 0 293.152
1981-01-03 293.985 324 0 293.985
1981-01-04 295.115 308 0 295.115
1981-01-05 296.579 326 0 296.579
1981-01-06 298.266 344 0 298.266
1981-01-07 300.084 342 0 300.084
1981-01-08 301.945 329 0 301.945
1981-01-09 303.747 357 0 303.747
1981-01-10 305.437 351 0 305.437
1981-01-11 306.967 352 0 306.967
1981-01-12 308.281 382 0 308.28
The code below calculates the RMSE for the entire dataset, irrespective of dates:
# Script that computes the Root Mean Squared Error (RMSE)
# set the working directory
setwd("D:\\Results\\")
# Read the column names from the first line of the file
header <- scan("4001968.txt", nlines = 1, what = character())
# Skip the first two lines (column names and units) when reading the data
y <- read.table("4001968.txt", skip = 2, header = FALSE, sep = "\t")
# Add the character vector header on as the names component
names(y) <- header
# Function for calculating RMSE
rmse <- function(error)
{
  sqrt(mean(error^2))
}
# Convert character to numeric
y$cout <- as.numeric(as.character(y$cout))
y$rout <- as.numeric(as.character(y$rout))
# cout is the simulated series, rout is the observed series
simulated <- y$cout
observed <- y$rout
# Calculate error
error <- simulated - observed
# Call the function
rmse(error)
The expected output is a single RMSE value computed from the January rows only.
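To make the goal concrete, here is a minimal sketch of one possible way to restrict the calculation to January. It assumes the DATE column is always in YYYY-MM-DD form and that January rows from all years should be pooled into one value. The first three data rows are copied from the sample above; the February row is made up purely to show that non-January rows get dropped, and the names `lines`, `is_jan` and `jan_rmse` are illustrative only.

```r
# Sample rows from the question, plus one hypothetical February row
# (invented here only to demonstrate the filter).
lines <- c(
  "DATE cout rout coub cinf",
  "UNITS m3/s m3/s m3/s m3/s",
  "1981-01-01 292.234 305 0 292.234",
  "1981-01-02 293.152 320 0 293.152",
  "1981-01-03 293.985 324 0 293.985",
  "1981-02-01 310.000 300 0 310.000"
)

# The first line holds the column names; the units line is skipped.
header <- scan(text = lines[1], what = character(), quiet = TRUE)
y <- read.table(text = lines[-(1:2)], header = FALSE,
                col.names = header, stringsAsFactors = FALSE)

# Parse the dates and flag the rows whose month is January.
y$DATE <- as.Date(y$DATE, format = "%Y-%m-%d")
is_jan <- format(y$DATE, "%m") == "01"

# Same RMSE function as in the script above.
rmse <- function(error) {
  sqrt(mean(error^2))
}

# RMSE of simulated (cout) minus observed (rout), January rows only.
jan_rmse <- rmse(y$cout[is_jan] - y$rout[is_jan])
print(jan_rmse)
```

With the real file, the same filter should apply to the data frame read by `read.table("4001968.txt", skip = 2, ...)`, provided `sep` matches the file's actual delimiter. If a separate value per year were wanted instead, the year could be extracted the same way with `format(y$DATE, "%Y")`.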