I feel this should be something easy. I have looked all over the internet, but I keep getting error messages. I have done plenty of analytics in the past but am new to R and programming.
I have a pretty basic function to calculate means across columns of data:
columnmean <- function(y) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i])
  }
  means
}
I'm in RStudio and testing it using the included 'airquality' dataset. When I load the AQ dataset and run my function:
data("airquality")
columnmean(airquality)
I get back:
NA NA 9.957516 77.882353 6.993464 15.803922
Because the first two variables in AQ have NAs in them. K, cool. I want to suppress the NAs such that R will ignore them and run the function anyway.
I am reading that I can specify this with na.rm=TRUE, like:
columnmean(airquality, na.rm = TRUE)
But when I do this, I get an error message saying:
"Error in columnmean(airquality, na.rm = TRUE) : unused argument (na.rm = TRUE)"
I'm reading all over the place that I simply need to include na.rm = TRUE and the function will run and ignore the NA values...but I keep getting this error. I have also tried use = "complete" and anything else I can find.
Two Caveats:
I know I can create a vector with is.na and then subset the data, but I don't want that extra step. I just want it to run the function and ignore the missing data.
I also know I can specify IN the function whether to ignore NAs or not, but I'd like a way to choose to ignore/not ignore on the fly, on an action-by-action basis, rather than having it be part of the function itself.
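For reference, here is a minimal sketch of one way the on-the-fly choice could work. The error above most likely comes from columnmean() declaring only the argument y, so it has nowhere to receive na.rm. The sketch assumes it is acceptable to change the signature so that extra arguments are forwarded to mean() through `...`. The name columnmean2 is hypothetical and used only to keep it separate from the original function.

```r
# Hypothetical variant of columnmean(): anything passed in `...`
# is forwarded unchanged to mean(), so the caller decides per call.
columnmean2 <- function(y, ...) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], ...)
  }
  means
}

data("airquality")
columnmean2(airquality)                # NAs propagate, same as the original
columnmean2(airquality, na.rm = TRUE)  # NAs are dropped column by column
```

This does touch the function, but only to pass arguments through. Whether NAs are ignored is still decided at each call rather than fixed inside the function. If a hand-written loop is not required, base R's colMeans(airquality, na.rm = TRUE) should give the same numbers.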
Help is appreciated. Thank you, everyone.