There is a data frame x with 5753 observations of 4 variables.
The column names are: date, Depth, var1, and var2. I converted date and Depth to factors before calling aggregate().
I want to calculate the mean and standard deviation of the two variables (var1 and var2), grouped by date and Depth.
When I run
aggregate(x[,3:4], by = list(x$date, x$Depth), FUN = function(x) c(avg = mean(x, na.rm = TRUE), SD= sd))
I get the average of var1 and the average of var2 grouped by date and Depth, but I do not get the SD.
When I run
aggregate(. ~ date+Depth, data = x, FUN = function(x) c(avg = mean(x, na.rm = TRUE), SD= sd))
I get an error message: "Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) : no rows to aggregate".
After counting the NA values in the two columns, I found that there are 5622 NA in var1 and 5049 NA in var2. I do not want to remove the NA values before applying aggregate() yet.
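For reference, a minimal sketch of how such NA counts can be obtained. The data frame `toy` and its values are made up for illustration and only mimic the column layout of x. The last line also counts the rows in which var1 and var2 are both non-missing, which may be worth checking given how many NA values there are.

```r
# Hypothetical toy data with the same column layout as x (values are made up)
toy <- data.frame(
  date  = factor(c("d1", "d1", "d2", "d2")),
  Depth = factor(c(1, 1, 2, 2)),
  var1  = c(1.2, NA, NA, NA),
  var2  = c(NA, 3.4, NA, 5.6)
)

# Number of NA values in each of the two value columns
na_counts <- colSums(is.na(toy[, c("var1", "var2")]))

# Number of rows where var1 and var2 are both non-missing
n_complete <- sum(complete.cases(toy[, c("var1", "var2")]))

print(na_counts)
print(n_complete)
```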
My questions are:
Why did I not get the SD with the first syntax?
Why does the second syntax not work? I learned this syntax from Stack Overflow, and it worked with the following data frame:
x3 <- read.table(text = "  id1 id2 val1 val2
1   a   x    1    9
2   a   x    2    4
3   a   y    3   NA
4   a   y    4   NA
5   b   x    1   NA
6   b   y    4   NA
7   b   x    3    9
8   b   y    2    8", header = TRUE)
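A sketch of the formula call on x3 with two changes that are my own assumptions, not part of the snippet I learned: sd is called on the data as sd(v, na.rm = TRUE) instead of being passed by name, and na.action = na.pass is supplied so that rows containing NA are kept instead of being dropped before grouping. It is only meant to show how the call behaves on this small example, not as a definitive fix for x.

```r
# Same example data as x3 above
x3 <- read.table(text = "  id1 id2 val1 val2
1   a   x    1    9
2   a   x    2    4
3   a   y    3   NA
4   a   y    4   NA
5   b   x    1   NA
6   b   y    4   NA
7   b   x    3    9
8   b   y    2    8", header = TRUE)

# Assumption: call sd() on the values, and keep rows with NA via na.pass
res <- aggregate(
  . ~ id1 + id2,
  data = x3,
  FUN = function(v) c(avg = mean(v, na.rm = TRUE), SD = sd(v, na.rm = TRUE)),
  na.action = na.pass
)

# Each value column of res is a matrix with columns "avg" and "SD"
print(res)
```

With na.pass, groups in which every value is NA still appear in the result, with NaN or NA in place of the statistics.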