I am trying to calculate the mean of a subset of my data frame. However, the subset turns out to be of class 'closure' when I just want it to be a vector. The head of my df looks like this:

          Date sulfate nitrate ID
1   2002-01-01      NA      NA  8
2   2002-01-02      NA      NA  8
3   2002-01-03      NA      NA  8
4   2002-01-04      NA      NA  8
5   2002-01-05      NA      NA  8
6   2002-01-06      NA      NA  8

There are non NA values in both "sulfate" and "nitrate" further down the df.

I have tried to subset using freem[pollutant] rather than freem$pollutant. This doesn't seem to make any difference.

pollutantmean <- function(directory, pollutant, id = 1:332) {
    means <- c()
    for(i in id) {
        x <- paste(getwd(), "/", directory, "/", sprintf("%03d", i), ".csv", sep = "")
        freem <- read.csv(x)
        inte <- freem$pollutant
        print(class(frame$pollutant))
        means[i] <- mean(inte, na.rm = TRUE)
    }
    mean(means)
}

I expect this for loop to fill the empty vector means with the mean of the subset for each selected monitor (each monitor is a separate csv file in my working directory).

Leonardo
Mark

1 Answer

The pollutantmean() function in the OP fails with the following error:

Error in frame$pollutant : object of type 'closure' is not subsettable

Why?

Line 7 includes the following code:

 print(class(frame$pollutant))

This line contains a typo: frame instead of freem. Since no object named frame is defined in the function, R resolves the name to frame(), a function in the graphics package, which has the following consequences:

  1. An R function is also a closure.
  2. Objects of type closure cannot be subset with the $ form of the extract operator.

Therefore, R generates the closure error message.
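The behaviour can be reproduced outside the function with a minimal sketch. It assumes only that the graphics package is installed (it ships with base R); the variable msg is introduced here for illustration, and the exact wording of the error text may vary between R versions.

```r
# frame() lives in the graphics package; like most R functions, it is a closure
typeof(graphics::frame)

# Applying $ to a function raises an error; tryCatch() captures its message
msg <- tryCatch(
  graphics::frame$pollutant,
  error = function(e) conditionMessage(e)
)
print(msg)
```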

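Changing Line 7 to print(class(freem$pollutant)) prints "NULL", exposing a second error in the code: use of the $ form of the extract operator with variable substitution in a function. The $ operator looks for a column literally named pollutant rather than the column whose name is stored in the pollutant argument.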

In this situation, the correct extract operator is [[, because [ returns a one-column data frame (an object of type list) rather than a vector, which causes the mean() function to return NA.

inte <- freem[[pollutant]] 
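As a rough illustration of the three forms of the extract operator, the sketch below uses a made-up three-row data frame in place of a monitor file. The values and the names by_dollar, by_single and by_double are assumptions for the example, not part of the assignment data.

```r
# Toy data frame standing in for one monitor file (values are made up)
freem <- data.frame(sulfate = c(NA, 2, 4), nitrate = c(1, NA, 3))
pollutant <- "sulfate"

by_dollar <- freem$pollutant     # looks for a column named "pollutant": NULL
by_single <- freem[pollutant]    # one-column data frame (a list)
by_double <- freem[[pollutant]]  # numeric vector

class(by_dollar)
class(by_single)
class(by_double)

# Only the [[ form gives mean() a numeric vector to work with
mean(by_double, na.rm = TRUE)
```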

Note that these changes will result in a working version of pollutantmean(), but this version will not pass the quiz in the Johns Hopkins R Programming course on Coursera. Why? The OP code calculates an unweighted mean when the assignment requires a weighted mean.
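To see why the two kinds of mean differ, consider a toy sketch with two hypothetical monitors that have different numbers of readings. The vectors a and b are invented for illustration and this is not a solution to the assignment.

```r
a <- c(1, 2, 3, 4)  # hypothetical monitor with four readings
b <- c(10)          # hypothetical monitor with one reading

# Unweighted: average the per-monitor means, each monitor counts equally
unweighted <- mean(c(mean(a), mean(b)))

# Weighted: pool all readings, so each reading counts equally
weighted <- mean(c(a, b))

# Same result as weighting each monitor's mean by its number of readings
weighted_alt <- weighted.mean(c(mean(a), mean(b)), w = c(length(a), length(b)))

print(c(unweighted = unweighted, weighted = weighted))
```

The single reading in b pulls the unweighted result up because it carries as much influence as all four readings in a combined.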

Since the OP is a homework assignment, I won't post a fully corrected pollutantmean() function. As a Community Mentor in the JHU Data Science Specialization, I am obligated not to post complete solutions to quizzes or project assignments. Instead, I refer the student to Common Mistakes: weighted vs. unweighted means for a detailed walkthrough of the difference between a weighted and an unweighted mean.

Len Greski