I have a data frame named granular that contains, in relevant part: a factor column GranularClass, one of whose values is "Constitutional Law I Spring 2016", and several numeric columns, for example Knowledge. The numeric columns contain NAs.
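In case it helps to reproduce this, here is a minimal stand-in with the same shape. It is not my real data: the values, the row count, and the second factor level "Some Other Class" are all invented for illustration, and only the column names match what I described.

```r
# Hypothetical stand-in for `granular`; every value here is invented.
granular <- data.frame(
  GranularClass = factor(c(
    "Constitutional Law I Spring 2016",
    "Constitutional Law I Spring 2016",
    "Constitutional Law I Spring 2016",
    "Some Other Class"  # hypothetical second factor level
  )),
  Knowledge = c(4, NA, 5, 3)  # numeric column containing an NA
)

# Inspect the structure: one factor column, one numeric column.
str(granular)
```

With this toy frame, the class of interest has three rows, one of which has an NA in Knowledge.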
I'm trying to write a function that counts the non-NA values for a given column, conditional on a given factor value. However, my attempt to count the values behaves differently depending on whether I write it as a function or just use it in the console.
More specifically, the following code fails:
# take subset of the dataframe containing only the factor values I want to look at:
isolate <- function(class) {
  return(granular[granular$GranularClass == class, ])
}
# count non-NA values:
cr <- function(df, column) {
  return(sum(!is.na(df$column)))
}
# this fails
cr(isolate("Constitutional Law I Spring 2016"), Knowledge)
That last call gives incorrect output (it just returns 0), and throws a warning:
Warning message:
In is.na(df$column) :
is.na() applied to non-(list or vector) of type 'NULL'
However, this succeeds:
sum(!is.na(isolate("Constitutional Law I Spring 2016")$Knowledge))
# gives correct output: [1] 62
And, so... huh? I believe that the working code in the last block is semantically identical to the function call in the first block that blows up. But obviously that's not right.
Am I somehow passing the column name into the function wrong? (Should I be passing it as a string? But this prior SO suggests you can't pass strings into the $ operator.)