Why do I get different stats when using summary(aov()) and anova_test() in R?

Question

This is an R statistics question. I have data from many subjects. My dependent variable is a blood measure, say white blood cell count (continuous, e.g. bc = 5.6). My independent variable of interest is group, Dx (3 levels: controls, depressed, remitted). I want to "correct" for age (continuous) and gender (binary) by adding them as covariates.

This gives me the formula:

myform_aov <- as.formula(sprintf("%s ~ %s + %s + %s", current_bc, "age","gender", "Dx"))

If I feed this formula into

anova <- summary(aov(myform_aov, data = data))

and

res.ancova <- data %>% anova_test(myform_aov)

I get (slightly) different results. Why is this, and which one is more correct to use?

What is the difference between summary(aov()) and anova_test()?

aov: Dx, p-val: 0.2377; age, p-val: 0.018; gender, p-val: 0.04

anova_test: Dx, p-val: 0.238; age, p-val: 0.014; gender, p-val: 0.06

So one gives 4 decimal places and the other gives 3. So the difference appears to just be from rounding? Without any sort of [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) it's not clear that there is any difference here. It's not even clear where the `anova_test` function comes from as that's not a base R function. What's "correct" to use is really a statistical decision, not a programming one. If you need statistical advice, ask for help at [stats.se] instead. — MrFlick, Jan 17 '23 at 14:20

score 2 · Answer 1 · answered Jan 17 '23 at 14:20

By default, anova_test() performs type II tests, whereas aov() performs type I (sequential) tests. You can make anova_test() comparable to aov() by specifying type=1.

library(ggplot2)
library(rstatix)

form <- qsec ~ as.factor(cyl) + hp

anova_test(data=mtcars, form )
#> Coefficient covariances computed by hccm()
#> ANOVA Table (type II tests)
#> 
#>           Effect DFn DFd     F     p p<.05   ges
#> 1 as.factor(cyl)   2  28 0.287 0.753       0.020
#> 2             hp   1  28 9.286 0.005     * 0.249
summary(aov(form, data=mtcars))
#>                Df Sum Sq Mean Sq F value   Pr(>F)    
#> as.factor(cyl)  2  34.61  17.303  10.021 0.000522 ***
#> hp              1  16.03  16.034   9.286 0.004995 ** 
#> Residuals      28  48.35   1.727                     
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova_test(data=mtcars, form, type=1)
#> ANOVA Table (type I tests)
#> 
#>           Effect DFn DFd      F        p p<.05   ges
#> 1 as.factor(cyl)   2  28 10.021 0.000522     * 0.417
#> 2             hp   1  28  9.286 0.005000     * 0.249

Created on 2023-01-17 by the reprex package (v2.0.1)
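
To see why the two kinds of test disagree, it can help to reproduce them with base R alone. The following is a minimal sketch, not what rstatix does internally. It assumes a model with main effects only (no interactions), in which case the marginal F tests from drop1() coincide with type II tests. The object names fit_a, fit_b, type1_a, type1_b and type2 are made up for illustration.

```r
# Sketch using base R only (no rstatix); mtcars ships with R.
fit_a <- lm(qsec ~ factor(cyl) + hp, data = mtcars)  # cyl entered first
fit_b <- lm(qsec ~ hp + factor(cyl), data = mtcars)  # hp entered first

# Type I (sequential): each term is adjusted only for the terms listed
# before it, so the table changes when the order of terms changes.
type1_a <- anova(fit_a)
type1_b <- anova(fit_b)

# Type II in a main-effects-only model: each term is adjusted for all
# other terms. drop1() gives these marginal F tests, and they do not
# depend on the order of terms in the formula.
type2 <- drop1(fit_a, test = "F")

print(type1_a)
print(type1_b)
print(type2)
```

In each sequential table, only the last term is adjusted for everything else, so only that row agrees with the type II table. This would be consistent with the output in the question: Dx is the last term in the formula and its p-value matches across the two functions (0.2377 vs 0.238), while age and gender, which come earlier, differ.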
