Here are two methods:
The Data
library(car)  ## not strictly needed here: oneway.test() comes from base R's stats package
df <- structure(list(Count = c(13, 14, 14, 12, 11, 13, 14, 15, 13, 12, 20, 15, 9, 5, 13, 14, 7, 17, 18, 14, 12, 12, 13, 14, 11, 10, 15, 14, 14, 13),
                     Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor")),
                .Names = c("Count", "Group"),
                row.names = c(NA, -30L), class = "data.frame")
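(Purely optional, but a quick per-group summary is a handy sanity check before running any pairwise tests; nothing below depends on it.)

## optional sanity check: sample size and mean for each group
tapply(df$Count, df$Group, length)
tapply(df$Count, df$Group, mean)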
Base R
First, build the set of unique pairs of levels of the Group factor:
allPairs <- expand.grid(levels(df$Group), levels(df$Group))
## http://stackoverflow.com/questions/28574006/unique-combination-of-two-columns-in-r/28574136#28574136
allPairs <- unique(t(apply(allPairs, 1, sort)))
allPairs <- allPairs[ allPairs[,1] != allPairs[,2], ]
allPairs
## [,1] [,2]
## [1,] "a" "b"
## [2,] "a" "c"
## [3,] "b" "c"
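(As an aside, base R's combn() produces the same unique pairs in one step; the t() is only there so the pairs come back as rows, matching allPairs above.)

## equivalent shortcut for the unordered pairs of levels
t(combn(levels(df$Group), 2))
## [,1] [,2]
## [1,] "a" "b"
## [2,] "a" "c"
## [3,] "b" "c"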
Now the analysis:
allResults <- apply(allPairs, 1, function(p) {
  dat <- df[ df$Group %in% p, ]
  ret <- oneway.test(Count ~ Group, data = dat, na.action = na.omit, var.equal = FALSE)
  ret$groups <- p
  ret
})
length(allResults)
## [1] 3
allResults[[1]]
## One-way analysis of means (not assuming equal variances)
## data: Count and Group
## F = 0.004, num df = 1.000, denom df = 10.093, p-value = 0.9508
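Each element of allResults is an ordinary htest object, so pulling out just the p-values (the same works for statistic) is a one-liner:

## p-values in the row order of allPairs: a-b, a-c, b-c
sapply(allResults, function(res) res$p.value)
## [1] 0.9507513 0.6342116 0.8084057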
If you want this as a matrix, perhaps this:
mm <- diag(length(levels(df$Group)))
dimnames(mm) <- list(levels(df$Group), levels(df$Group))
pMatrix <- lapply(allResults, function(res) {
  ## not fond of out-of-scope assignment ...
  mm[res$groups[1], res$groups[2]] <<- mm[res$groups[2], res$groups[1]] <<- res$p.value
})
mm
## a b c
## a 1.0000000 0.9507513 0.6342116
## b 0.9507513 1.0000000 0.8084057
## c 0.6342116 0.8084057 1.0000000
(This can be done just as easily for the F-statistic.)
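For example, a minimal variant of the same loop, storing res$statistic instead of res$p.value (ff is just an arbitrary name, and the diagonal is left as NA because a group compared with itself has no meaningful F value):

ff <- matrix(NA_real_, nrow = nlevels(df$Group), ncol = nlevels(df$Group),
             dimnames = list(levels(df$Group), levels(df$Group)))
invisible(lapply(allResults, function(res) {
  ff[res$groups[1], res$groups[2]] <<- ff[res$groups[2], res$groups[1]] <<- unname(res$statistic)
}))
ff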
Using dplyr
Again, first build the set of unique pairs of levels of the Group factor:
library(dplyr)
## http://stackoverflow.com/questions/28574006/unique-combination-of-two-columns-in-r/28574136#28574136
allPairs <- expand.grid(levels(df$Group), levels(df$Group), stringsAsFactors = FALSE) %>%
  filter(Var1 != Var2) %>%
  mutate(key = paste0(pmin(Var1, Var2), pmax(Var1, Var2))) %>%
  distinct(key, .keep_all = TRUE) %>%   ## .keep_all keeps Var1/Var2 alongside key
  select(-key)
allPairs
## Var1 Var2
## 1 b a
## 2 c a
## 3 c b
If the order really matters, you can add dplyr::arrange(Var1, Var2) early in this pipeline, perhaps right after the call to expand.grid.
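Concretely, that version of the pipeline would look like this, leaving each pair with its levels in alphabetical order:

allPairs <- expand.grid(levels(df$Group), levels(df$Group), stringsAsFactors = FALSE) %>%
  arrange(Var1, Var2) %>%               ## force a stable a-before-b ordering
  filter(Var1 != Var2) %>%
  mutate(key = paste0(pmin(Var1, Var2), pmax(Var1, Var2))) %>%
  distinct(key, .keep_all = TRUE) %>%
  select(-key)
allPairs
##   Var1 Var2
## 1    a    b
## 2    a    c
## 3    b    c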
Now the analysis:
ret <- allPairs %>%
  rowwise() %>%
  do({
    data.frame(.,
               oneway.test(Count ~ Group, filter(df, Group %in% c(.$Var1, .$Var2)),
                           na.action = na.omit, var.equal = FALSE)[c('statistic', 'p.value')],
               stringsAsFactors = FALSE)
  })
ret
## Source: local data frame [3 x 4]
## Groups: <by row>
## Var1 Var2 statistic p.value
## 1 b a 0.004008909 0.9507513
## 2 c a 0.234782609 0.6342116
## 3 c b 0.061749571 0.8084057
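(If you also want the matrix-style view from the dplyr result, reshaping it is one option; tidyr::pivot_wider is my suggestion here rather than anything the pipeline above requires, and it yields one triangle of the matrix rather than the full symmetric version.)

library(tidyr)
ret %>%
  ungroup() %>%
  select(Var1, Var2, p.value) %>%
  pivot_wider(names_from = Var2, values_from = p.value)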
(I'm making no claims about the performance of either approach; one will often shine on a small data set like this one, while the other pulls ahead on larger sets. Both perform the same pairwise statistical comparisons and return the same results. Over to you!)