How to perform a two-way mixed ANOVA with three groups, three time points and one dependent variable in multiple columns in R

Question

I have extracted time series from a few regions of interest in the brain (fMRI) and I have added pairwise correlation (Fisher-Z) values for each subject under columns corresponding the correlation between two nodes in the brain (for example: stim_lvis3, stim = stimulation site and lvis3= left visual network 3). Now, I would like to perform ANOVAs on this dataset to look at the effects and between/within group differences (3 groups x 3 timepoints). My data is already in long format.

*groups= ctbs [10 subjects x 3 timepoints], itbs = [10 subjects x 3 timepoints], and sham [10 subjects x 3 timepoints]

Any suggestions on how this can be done, given that I have 12 columns with connectivity values (stim_lvis3....stim_rpcc1). for example I have not been able to box plot the data faceted both by time and group?
How to perform a two-way mixed anova in this situation for all 12 columns for each group at a specific timepoints and then compare groups at each timepoint?

data in long format

I converted subject, time and group to factors

tbs %>%
  group_by(time, group) %>%
  get_summary_stats(stim_lVis3, type = "mean_sd")

Error in tbs(.) : could not find function "tbs"

bxp <- ggboxplot(
  tbs, x = "time", y = "stim_lvis3",
  color = "group", palette = "jco"
)

bxp

Error in FUN(X[[i]], ...) : object 'stim_lvis3' not found

score 1 · Accepted Answer · answered Mar 25 '22 at 16:12

Welcome to SO! It's looks like you're new, but in the future to get great answers quickly, make sure you include sample data in a format that useable (i.e., the output from dput(head(myData))). Check it out: making R reproducible questions.

I know of two approaches to completing within and between ANOVA. The easier to implement version is from the package ez. The package jmv offers a much more complex write-up but you have an immense amount of control, as well. I'm sure there is more to ez's version, but I haven't worked with that package very much.

I created data to somewhat simulate what you are working with.

library(tidyverse)
library(jmv)
library(ez)

set.seed(35)
df1 <- data.frame(subject = factor(rep(1:15, each = 3)),
                  time = factor(rep(c(1:3), 15)),
                  group = factor(rep(1:3, each = 15)),
                  stim_IVis3 = rnorm(45, .5, .15),
                  stim_IVis4 = rnorm(45, .51, .125))

head(df1)
summary(df1)

To use ez, it's a pretty straightforward implementation. Although, I couldn't find an option that allowed for multiple dependent variables. Essentially, you would either need to pivot_long or you can use jmv. For this method, you don't get the post hoc comparisons; the effect size is the generalized η².

ezANOVA(df1, dv = stim_IVis3, wid = subject, within = time, 
        between = group, detailed = T)

# $ANOVA
#        Effect DFn DFd         SSn       SSd           F            p p<.05        ges
# 1 (Intercept)   1  12 12.58700023 0.3872673 390.0252306 1.616255e-10     * 0.92579702
# 2       group   2  12  0.05853169 0.3872673   0.9068417 4.297644e-01       0.05483656
# 3        time   2  24  0.05417372 0.6215855   1.0458491 3.668654e-01       0.05096178
# 4  group:time   4  24  0.06774352 0.6215855   0.6539102 6.298074e-01       0.06292379
# 
# $`Mauchly's Test for Sphericity`
#       Effect         W         p p<.05
# 3       time 0.9914977 0.9541232      
# 4 group:time 0.9914977 0.9541232      
# 
# $`Sphericity Corrections`
#       Effect       GGe     p[GG] p[GG]<.05      HFe     p[HF] p[HF]<.05
# 3       time 0.9915694 0.3664558           1.187039 0.3668654          
# 4 group:time 0.9915694 0.6286444           1.187039 0.6298074

Now to use jmv, you will need to pivot the data wider. Your within-subject data needs to be in separate columns. Since time is the factor that represents with the within-subject values in the stim... columns, that's what you need to pivot.

df2 <- df1 %>% 
  pivot_wider(id_cols = c(subject, group),
              names_from = time,
              values_from = starts_with("stim"),
              names_vary = "fastest")

head(df2)
# # A tibble: 6 × 8
#   subject group stim_IVis3_1 stim_IVis3_2 stim_IVis3_3 stim_IVis4_1 stim_IVis4_2 stim_IVis4_3
#   <fct>   <fct>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
# 1 1       1            0.497        0.505        0.601        0.417        0.776        0.660
# 2 2       1            0.584        0.408        0.737        0.454        0.670        0.650
# 3 3       1            0.741        0.399        0.450        0.306        0.665        0.577
# 4 4       1            0.223        0.450        0.350        0.522        0.342        0.417
# 5 5       1            0.717        0.661        0.581        0.601        0.700        0.686
# 6 6       2            0.534        0.780        0.443        0.420        0.565        0.739

Now you're ready for jmv. bs equates to between-subjects; rm equates to repeated measures.

fit = anovaRM(data = df2, 
              ss = "3",                                # type of SS (1, 2, or 3)
              bs = list("group"),                      # between subjects (links with other parameters this way)
              bsTerms = list("group"),
              rm = list(list(label = "stim_IVs",          # within subjects
                             levels = c(names(df2)[3:8]))), # could also write c("stim_IVs3_1", "stim_IVs3_2", "stim_IVs3_3")
              rmCells = list(list(measure = names(df2)[3], # could also write "stim_IVs3_1"
                                  cell = names(df2)[3]),   # the group associated in 'rm' level (I used the column name)
                             list(measure = names(df2)[4],     
                                  cell = names(df2)[4]),
                             list(measure = names(df2)[5],
                                  cell = names(df2)[5]),
                             list(measure = names(df2)[6],
                                  cell = names(df2)[6]),
                             list(measure = names(df2)[7],
                                  cell = names(df2)[7]),
                             list(measure = names(df2)[8],
                                  cell = names(df2)[8])
                             ),    
              rmTerms = list("stim_IVs"),         # groups variable for repeated/within measures
              emMeans = list(list("stim_IVs", "group")),  # all grouping vars in the ANOVA (for the em tables)
              emmPlots = T,                  # show em plots
              emmTables = T,                 # show em tables
              effectSize = c("omega", "partEta"), # multiple options here, see help
              spherTests = T,                # use correction
              spherCorr = c("none", "GG", "HF"),  # multiple options here, see help
              leveneTest = T,                # check homogeneity (p GREATER than .05 is good)
              qq = T,                        # plot normality validation qq plot
              postHoc = list("group",c("group","stim_IVs"),"stim_IVs"),
              postHocCorr = "tukey")         # use TukeyHSD

If you print fit, you're going to get a ton of information. In addition to the content provided by ezANOVA, this provides the post hoc comparisons of each group, time, dependent variable, and the intermix of each; the results of the statistical assumptions: Levene's for homogeneity and a QQ plot of the residuals for normality; and the estimated marginal means table and a plot of those means.

I realize that you have more fields in your data than what I have here. I believe I've provided enough for you to build off of.

I suggest that you determine what you want to learn from the data and choose the algorithm from there.

If you have any questions, let me know.

Thank you for the amazing explanation and tips! – user18578213 Mar 26 '22 at 12:27 — user18578213, Mar 26 '22 at 12:27

How to perform a two-way mixed ANOVA with three groups, three time points and one dependent variable in multiple columns in R

1 Answers1