I aggregate data containing NAs and therefore I include na.action = NULL as explained here. Here is the code that works:
# Toy data.
df <- data.frame(x= 1:10, group= rep(1:2, 5), other_var= rnorm(10))
# Aggregate with formula.
aggregate(formula= x ~ group, data= df, na.action= NULL, FUN= function(i) sum(i))
In my situation I cannot provide the variable names as a formula because they can change. Thus, I provide them as a string vector via the x and by arguments, like this:
var_names <- c("x", "group")
aggregate(x= df[ , var_names[1]], by= list(df[ , var_names[2]]), na.action= NULL, FUN= function(i) sum(i))
This results in an error. Interestingly, leaving out na.action= NULL, i.e. aggregate(x= df[ , var_names[1]], by= list(df[ , var_names[2]]), FUN= function(i) sum(i)), does not end with an error but returns the expected output. How can I keep rows containing NAs from disappearing while providing the column names as a vector? I do need to include na.action= NULL because my real data contains NAs.
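For reference, here is a minimal sketch of one direction that might work: building the formula from the strings, so that the formula interface (and with it na.action) stays available. It assumes reformulate() from the stats package is acceptable here. The NA placed in x and the names f and res are only for illustration and are not part of my real data.

```r
# Toy data with one NA in the aggregated column (illustrative assumption).
df <- data.frame(x = c(NA, 2:10), group = rep(1:2, 5), other_var = rnorm(10))
var_names <- c("x", "group")

# Build the formula x ~ group from the column names given as strings.
f <- reformulate(termlabels = var_names[2], response = var_names[1])

# Pass the formula as the first, unnamed argument so the call does not
# depend on how the formula method names that argument.
res <- aggregate(f, data = df, na.action = NULL, FUN = function(i) sum(i))
res
```

With na.action = NULL the row holding the NA is kept and handed to FUN, so the sum for group 1 comes out as NA instead of being computed from the remaining rows.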