1

I'm a beginner of R, and I would like to perform ANCOVA in a dataset with over 200 columns of outcome variables. The most important thing for me is to extract both p values and marginal means of these variables. I successfully extracted p values with the help of lapply() function, but when I extracted marginal means I got such error Error in eval(predvars, data, env) : object 'x' not found.

Here I use the built-in dataset "iris" as an example to display my problem.

data("iris")

#load packages that I would use

library(car); library(compute.es); library(effects); library(ggplot2);
library(multcomp); library(pastecs); library(WRS)

#set contrasts for the following ANCOVA tests:

contrasts(iris$Species) <- contr.poly(3)

#perform ANCOVA for multiple outcome variables at the same time (Here I compare multiple outcome variables at different Specie levels, with Petal.Width as the covariate)

list1 <- lapply(iris[, 1:3], function(x) Anova(aov(x ~ Petal.Width  + Species, data = iris), type="III"))
str(list1)

#extract p values of the main tests

pvalues <- stack(lapply(iris[, 1:3], function(x) Anova(aov(x ~ Petal.Width  + Species, data = iris), type="III")[3, 4]))[2:1]

The above code works well, but when I use effect() function to extract marginal means I got error: #extract marginal means

list2 <- lapply(iris[, 1:3], function(x) summary(effect("Species", aov(x ~ Petal.Width + Species, data = iris)), se=TRUE))

Error in eval(predvars, data, env) : object 'x' not found

marginal.means <- stack(lapply(iris[, 1:3], function(x) summary(effect("Species", aov(x ~ Petal.Width + Species, data = iris)), se=TRUE)[[5]][[1]][1]))[2:1]

Error in eval(predvars, data, env) : object 'x' not found

#when I extract the marginal mean of a certain variable (such as Sepal.Length), not use the <lapply(), it works:

marginal.mean1 <- summary(effect("Species", aov(Sepal.Length ~ Petal.Width + Species, data = iris)), se=TRUE)
marginal.mean1

The output:

 Species
    setosa versicolor  virginica 
  5.880113   5.819859   5.830028 

 Lower 95 Percent Confidence Limits
Species
    setosa versicolor  virginica 
  5.490905   5.676927   5.485953 

 Upper 95 Percent Confidence Limits
Species
    setosa versicolor  virginica 
  6.269322   5.962791   6.174102

Due to the over 200 columns of outcome variables, I would like to extract marginal means once rather than extracting them one by one.

Many thanks for your help,

Ella

Ella_may
  • 59
  • 7

1 Answers1

1

You get that error because the function effect() calls update() and tries to re-fit the model, and at that point, it cannot access your x anymore. (Ok maybe I did not explain that too well) You can read this book chapter to know how functions work.

Try to keep everything within the data.frame and instead provide the formula to fit a different variable:

list2 <- lapply(colnames(iris)[1:3], function(x){
anova_fit = aov(reformulate(c("Petal.Width","Species"),x), data = iris)
summary(effect("Species",anova_fit, se=TRUE))
})

As you can see, this can be applied to your other functions as well.

StupidWolf
  • 45,075
  • 17
  • 40
  • 72
  • Thank you so much! It does work! However, I still got an error when I adjusted the code into stack() function. The code is: `means.all <- stack(lapply(colnames(iris)[1:3], function(x){ anova_fit = aov(reformulate(c("Petal.Width","Species"),x), data = iris) summary(effect("Species",anova_fit, se=TRUE))[[5]][1] }))[2:1]`. The error is: `Error in rep.int(factor(names(x), unique(names(x))), lengths(x)) : invalid 'times' value`. – Ella_may Jun 29 '20 at 01:49
  • Hi @Ella_may, i am not very familiar with stack() or what you want to achieve with this. I believe this is outside the scope of your original question. You can post this as a separate question. Again my apologies, I really don't know what to use stack() for – StupidWolf Jun 29 '20 at 02:01
  • Many thanks, @StupidWolf. The stack function is used to transform data available as separate columns in a data frame or list into a single column. I want to use `stack()` to extract the values of marginal means from the output table of `summary(effect())`. As I wrote in the question descriptions, I successfully extract p values of ancova main test with the help of `stack()` and `lapply`, but failed with the marginal mean extraction. Many thanks, you still pointed out the key point and made me one step closer to solving the problem. – Ella_may Jun 29 '20 at 02:38
  • The `stack()` worked separately, but failed with the combination of `lapply()`. I do not know why this happened. The code with no error is: `anova_fit = aov(reformulate(c("Petal.Width","Species"),"Sepal.Length"), data = iris)`; `means1 <- summary(effect("Species",anova_fit, se=TRUE))[[5]][1]`. – Ella_may Jun 29 '20 at 02:44
  • the error comes about because even though there is an lapply, each element in the list returned is not a vector. see ```str(list2[[1]])``` So this is a list with different elements. which element do you want to "stack" ? I saw someone has answered your question in the new question posted.. I think it's more or less the way i would handle it – StupidWolf Jun 29 '20 at 08:25
  • Yes, that's very similar to what I want. Thank you so much for your help. – Ella_may Jun 29 '20 at 10:53