In addition to @akrun's answer, one can also pass the pollutant argument of the pollutantmean() function directly to the extract operator. This avoids the conditional logic that the original question used to assign a column number.
We'll use the first 20 non-missing observations from sensor 001 for the pollutantmean() assignment and illustrate two forms of the extract operator, [[ and [.
data <- structure(list(Date = c("2003-10-06", "2003-10-12", "2003-10-18",
"2003-10-24", "2003-10-30", "2003-11-11", "2003-11-17", "2003-11-23",
"2003-11-29", "2003-12-05", "2003-12-11", "2003-12-23", "2003-12-29",
"2004-01-04", "2004-01-10", "2004-01-22", "2004-01-28", "2004-02-03",
"2004-02-09", "2004-02-21"), sulfate = c(7.21, 5.99, 4.68, 3.47,
2.42, 1.43, 2.76, 3.41, 1.3, 3.15, 2.87, 2.27, 2.33, 1.84, 7.13,
2.05, 2.05, 2.58, 3.26, 3.54), nitrate = c(0.651, 0.428, 1.04,
0.363, 0.507, 0.474, 0.425, 0.964, 0.491, 0.669, 0.4, 0.715,
0.554, 0.803, 0.518, 1.4, 0.979, 0.632, 0.506, 0.671), ID = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L)), row.names = c(279L, 285L, 291L, 297L, 303L, 315L,
321L, 327L, 333L, 339L, 345L, 357L, 363L, 369L, 375L, 387L, 393L,
399L, 405L, 417L), class = "data.frame")
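# [[ extracts a single column by name and returns it as a vector
mean(data[["sulfate"]], na.rm = TRUE)
# [ with an empty row index and a column name also returns a vector
mean(data[, "nitrate"], na.rm = TRUE)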
...and the output:
> mean(data[["sulfate"]],na.rm=TRUE)
[1] 3.287
> mean(data[,"nitrate"],na.rm=TRUE)
[1] 0.6595
>
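The calls above use literal column names. Inside a function the name arrives as a character string in a variable, so here is a minimal sketch of the same extraction with a variable, contrasted with the $ operator, which does not evaluate the variable. The readings data frame and its values are made up for illustration and are not taken from the assignment data.

```r
# Hypothetical miniature data frame, for illustration only
readings <- data.frame(sulfate = c(7.21, NA, 4.68),
                       nitrate = c(0.651, 0.428, NA))

# The column name held in a variable, as a function argument would be
pollutant <- "nitrate"

by_double_bracket <- readings[[pollutant]]  # vector from the nitrate column
by_single_bracket <- readings[, pollutant]  # same vector, via [row, column]
by_dollar <- readings$pollutant             # NULL: $ looks for a column literally named "pollutant"

mean(by_double_bracket, na.rm = TRUE)
identical(by_double_bracket, by_single_bracket)
is.null(by_dollar)
```

Because [[ and [ evaluate their argument, no if/else mapping from pollutant name to column number is needed.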
Applying this approach within the pollutantmean() function, the code would look like this:
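pollutantmean <- function(directory, pollutant, id = 001:332) {
  # read the files for the given sensor IDs into a single data frame
  # data <- (code goes here)
  # pollutant is a character string, so it can be passed straight to [[
  mean(data[[pollutant]], na.rm = TRUE)
}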
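The skeleton above leaves the file-reading step open. One possible way to fill it in is sketched below, assuming each sensor's readings sit in a file named with the zero-padded ID, such as 001.csv, inside directory. That file layout, the name pollutantmean_sketch, and the demo values are assumptions for illustration, not something stated in the answer.

```r
# Sketch only: assumes files are named <directory>/<zero-padded id>.csv,
# e.g. 001.csv (an assumption, not confirmed by the post)
pollutantmean_sketch <- function(directory, pollutant, id = 1:332) {
  # build file paths such as "<directory>/001.csv"
  files <- file.path(directory, sprintf("%03d.csv", id))
  # read each file and stack the rows into one data frame
  data <- do.call(rbind, lapply(files, read.csv))
  # extract the requested column by name and average the non-missing values
  mean(data[[pollutant]], na.rm = TRUE)
}

# Usage with two small temporary files of made-up values
demo_dir <- file.path(tempdir(), "pollutant_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2003-10-06", "2003-10-12"),
                     sulfate = c(2, NA), nitrate = c(0.4, 0.6), ID = 1L),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(Date = c("2003-10-06", "2003-10-12"),
                     sulfate = c(4, 6), nitrate = c(NA, 0.8), ID = 2L),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

pollutantmean_sketch(demo_dir, "sulfate", id = 1:2)
pollutantmean_sketch(demo_dir, "nitrate", id = 1:2)
```

Stacking all rows before calling mean() gives the mean over every observation, rather than a mean of per-file means, which would weight sensors with few readings too heavily.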