I am well aware there are much better solutions for the particular problem described below (e.g., cor, or rcorr in Hmisc, as discussed here). This is just an illustration of a more general R issue I can't figure out: passing multiple variable names from a character vector to a formula statement within a function.
Assume there is a dataset consisting of numeric variables.
vect.a <- rnorm(n = 20, mean = 0, sd = 1)
vect.b <- rnorm(n = 20, mean = 0, sd = 1)
vect.c <- rnorm(n = 20, mean = 0, sd = 1)
vect.d <- rnorm(n = 20, mean = 0, sd = 1)
dataset <- data.frame(vect.a, vect.b, vect.c, vect.d)
names(dataset) <- c("var1", "var2", "var3", "var4")
A correlation test has to be performed for each possible pair of variables within this data set, using a formula statement of the type ~ VarA + VarB within the function cor.test:
for (i in 1:(length(names(dataset)) - 1)) {
  for (j in (i + 1):length(names(dataset))) {
    cor.test(~ names(dataset)[i] + names(dataset)[j], data = "dataset")
  }
}
which returns an error: invalid 'envir' argument of type 'character'
I assume a character string is incompatible with the formula statement, but which class would be compatible with it? If the entire approach is wrong, please explain why and provide or point to an alternative solution. If the approach is somehow "ugly" or "non-R", please explain why.
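For reference, here is a minimal sketch of one way the loop could be made to work. It is not necessarily the best approach. It assumes that the reported error stems from data = "dataset" being a quoted string rather than the data frame itself, and that the formula needs to be built from the names before it is passed to cor.test. The names pairs and results, and the fixed seed, are introduced here purely for illustration. reformulate() and combn() are base R functions.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
dataset <- data.frame(var1 = rnorm(20), var2 = rnorm(20),
                      var3 = rnorm(20), var4 = rnorm(20))

# All unordered pairs of column names, returned as a list of length-2 character vectors
pairs <- combn(names(dataset), 2, simplify = FALSE)

results <- lapply(pairs, function(p) {
  # reformulate() builds a one-sided formula from strings, e.g. ~var1 + var2.
  # as.formula(paste("~", p[1], "+", p[2])) would be an equivalent alternative.
  f <- reformulate(p)
  # Pass the data frame object itself, not its name in quotes
  cor.test(f, data = dataset)
})

# Label each test by the pair it belongs to, e.g. "var1_var2"
names(results) <- vapply(pairs, paste, character(1), collapse = "_")
```

With this, results[["var1_var2"]]$estimate would hold the correlation for the first pair. The design choice is to treat the formula as an object constructed from strings, instead of writing the strings inside a literal ~ expression, where they are taken as unevaluated code rather than as variable names.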