4

I'd like to do a pairwise comparison post-hoc test on Levene's test in R. I know how to do it in SAS using PROC GLM but I can't seem to figure out how to do it in R. Does anyone have any idea? In the example below I'd like to be able to test the homogeneity of the variance between all levels of "cat" i.e. A-B, A-C, A-D, B-C, B-D, C-D. The best way I've found is to subset my data to each of those pairs, then run a Levene's test for each subset, then do a Bonferroni correction at the end. However, this isn't a practical solution when my number of factors becomes large.

library(car)
dat <- rnorm(100, mean=50, sd=10)
cat <- rep(c("A", "B", "C","D"), each=25)
df <- data.frame(cat,dat)
df$cat <- as.factor(df$cat)

LT <- leveneTest(dat ~ cat, data = df)
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
Nathan
  • 323
  • 3
  • 13
  • 1
    can you show us how you would do it in `PROC GLM`/point us to the place in the online SAS documentation where it describes this option in SAS? – Ben Bolker Apr 27 '17 at 01:29
  • I don't actually have SAS anymore to run it on, thus needing the R. However, I believe in editing my title you've taken away from the fact that my question relates to R... not any other language. I only mentioned SAS because in some old notes I have something along the lines of the below: PROC GLM data=dataset; MEANS classvar/HOVTEST=LEVENE / tukey; – Nathan Apr 27 '17 at 13:59
  • I don't actually have SAS anymore to run it on, thus needing the R. However, I believe in editing my title you've taken away from the fact that my question relates to R... not any other language. I only mentioned SAS because in some old notes I have something along the lines of the below: PROC GLM data=dataset; class classvar; model Y=classvar; MEANS classvar/HOVTEST=LEVENE / tukey; Also, in editing my code I feel as though you've made it less readable... in the future, if you wouldn't mind, I wouldn't assume that the way that's easiest for you is easiest for everyone. – Nathan Apr 27 '17 at 14:08
  • (1) It is standard SO policy that information in tags (such as "in R") should not also be in the title: https://meta.stackexchange.com/questions/195741/discourage-use-of-tags-in-title – Ben Bolker Apr 28 '17 at 00:33
  • (2) Indenting: sorry, you're welcome to roll back my changes ... – Ben Bolker Apr 28 '17 at 00:33
  • (3) I asked about SAS because as far as I can tell `HOVTEST=LEVENE` and `/ TUKEY` are separate specifications; first says to do an overall test of homo/heterogeneity of variance, second says to test means pairwise. *If* SAS had a built-in pairwise test for homog of variances (which is not impossible, but which at this point I think is unlikely) then we could read the description and see what needs to be implemented. Otherwise we have to start from scratch, although [this](http://support.minitab.com/en-us/minitab/17/Multiple_Comparisons_Method_Test_for_Equal_Variances.pdf) looks useful ... – Ben Bolker Apr 28 '17 at 00:40
  • Thanks for letting me know. I guess with the tags, the inclusion of R in the title is redundant information. Thanks for the link. I'm in the process of implementing the LC test. However, it appears as if the "W50" mentioned in the article is just levene's test with a Bonferroni correction. – Nathan May 01 '17 at 15:20

1 Answers1

7

Because a Levene test is simply an ANOVA conducted on sample variance (residuals) instead of sample means, you can calculate the residuals manually, then run the ANOVA with a TukeyHSD test as a post-hoc.

First, multiple comparisons as the title suggests: Using your example, with an extra factor (cat2) so we can do an interaction as well:

df <- df %>% group_by(cat, cat2) %>% 
  mutate(dat.med = ifelse(dat,median(Ctmax, na.rm=TRUE), ifelse(dat==NA, NA)))

The code above skips NA values and calculates sample medians for each factor combination, putting them in a new column (dat.med) in the dataset.

Then we calculate the residuals, changing them to absolute values and putting them in another column:

df$dat.med.res<-abs(df$dat-df$dat.med)

# Then we run an ANOVA, and post-hoc if necessary:
levene.dat.aov<-aov(dat.med.res~cat*cat2,df)
summary(levene.dat.aov)
TukeyHSD(levene.dat.aov)

To add in repeated measures, change the anova to:

 aov(dat.med.res~cat+Error(Subject/(cat)),df)

For a pairwise comparison with a two-level factor (using package PairedData):

levene.var.test(df$dat[df$cat=="A"], df$dat[df$cat=="B"],location=c("median")) 
Grubbmeister
  • 857
  • 8
  • 25