I have a data.csv file containing 5 columns. I want to use a for loop in R to calculate a moving average with a window size of 10 over the 5th column.

for(i in 1: length[ , 5]-9) {  
 data$Mean [i] <- mean (data[i,5]:data[(i+9),5]) 
} 

The goal is to calculate the mean of each run of 10 consecutive rows and store the results in a new column of the same data frame.

It calculates the wrong values, and I get the following error:

Error in data[i, 5]:data[(i + 9), 5] : NA/NaN argument

Jaap
Kumar

1 Answer

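You have made a couple of mistakes in the line that calculates the mean. The most significant one is that when you calculate the mean for the last 9 rows of your data frame, you go out of bounds. For example, if your data frame has 100 rows, at row 92 you are trying to get the mean of rows 92:101, and of course there is no row 101. The other is that data[i,5]:data[(i+9),5] builds a sequence between two cell values instead of selecting rows i to i+9; the row range belongs inside the brackets.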
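To make both problems concrete, here is a small sketch. The data frame `df` and its values are made up purely for illustration; they are not from the question.

```r
# Hypothetical 5-row data frame, used only for illustration
df <- data.frame(a = 1:5, b = c(2.5, 4.1, 3.3, 5.0, 1.2))

# Indexing one row past the end returns NA instead of raising an error
past_end <- df[6, 2]

# `a:b` builds a sequence from the VALUE a towards the VALUE b;
# it does not select the rows in between
value_seq <- df[1, 2]:df[3, 2]

# With an NA endpoint, `:` stops with the error quoted in the question
msg <- tryCatch(df[4, 2]:df[6, 2], error = function(e) conditionMessage(e))

# Putting the row range inside the brackets selects the rows themselves
rows_1_to_3 <- df[1:3, 2]
mean(rows_1_to_3)
```

The out-of-range lookup is what turns into the NA/NaN error, and the misplaced `:` is why the means were wrong even for rows where no error occurred.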

It should be something like this:

for (i in seq_len(nrow(data))) {
    # mean of rows i to i+9, clipped at the last row of the data frame
    data$Mean[i] <- mean(data[i:min(i + 9, nrow(data)), 5])
}

Also, it's generally a bad idea to use data as a variable name, since R already provides a data() function (in the utils package, which is loaded by default). Simply choose a similar name, like "mydata".

A reproducible example follows. It computes the mean of the next ten rows or, for the last 9 rows, the mean of however many rows remain.

mydata <- data.frame(col_1 = rnorm(100),
                     col_2 = rnorm(100),
                     col_3 = rnorm(100),
                     col_4 = rnorm(100),
                     col_5 = rnorm(100))


for (i in seq_len(nrow(mydata))) {
    # mean of rows i to i+9, clipped at the last row of the data frame
    mydata$Mean[i] <- mean(mydata[i:min(i + 9, nrow(mydata)), 5])
}

head(mydata)

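If you don't want a mean for the last nine rows, which lack a full window of ten, do this instead:

for (i in seq_len(nrow(mydata))) {
    # ifelse() is meant for vectors; a plain if/else fits a single condition
    mydata$Mean[i] <- if (i + 9 <= nrow(mydata)) {
        mean(mydata[i:(i + 9), 5])   # full window of ten rows
    } else {
        NA                           # fewer than ten rows remain
    }
}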
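As an aside, the same forward-looking window can be computed without an explicit loop. The following is only a sketch of one base R option, using embed() from the stats package; the names `x`, `k`, `roll`, and `loop` are illustrative and `x` merely stands in for mydata[, 5].

```r
set.seed(1)       # arbitrary seed so the sketch is repeatable
x <- rnorm(100)   # stand-in for mydata[, 5]
k <- 10           # window size

# embed(x, k) returns a matrix whose row i holds x[i + k - 1], ..., x[i],
# so each row mean is the mean of one complete window starting at row i
full_windows <- rowMeans(embed(x, k))

# Pad with NA so the result lines up with the original rows
roll <- c(full_windows, rep(NA_real_, k - 1))

# Cross-check against the explicit loop over complete windows
loop <- rep(NA_real_, length(x))
for (i in seq_len(length(x) - k + 1)) {
  loop[i] <- mean(x[i:(i + k - 1)])
}
all.equal(roll, loop)
```

The result could then be assigned with something like mydata$Mean <- roll. For long columns this avoids growing the data frame one cell at a time, though the loop is perfectly fine for 100 rows.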
HAVB