I'm trying to make a table that outputs the summary statistics for a large study that we usually analyze by 2-way anova, looking at main effects of both variables as well as an interaction.
I'd like a way to run the stats quickly, and output them in a format that is easy to read, and if had nice formatting that would be even better.
I've been able to get both the 2-way anova output, and I've also used gtsummary package and tbl_summary
to make a table. However, I can't figure out how to group by more than 1 variable. My solution has been to create a new variable that combines the two independent variables, just to split data into the right groups.
Reproducible Example Below.
I'm wondering if there's a way to make a table with the mean(sem) output as I have it, but to have the results of my 2-way anova (also pasted below). In this titanic example, I'd like a column for P-value for main effect of "Sex", next column for p-value main effect of "Embarked", then a p value for the interaction.
Any thoughts?
library(titanic)
library(tidyverse)
library(gtsummary)
library(plotrix) #has a std.error function
##I really want to look at a 2-way anova, looking for the p-value for Sex, Embarked, and their interaction.
#This code just allows me to make a table with the 4 columns I want, but of course it now won't do the correct stats.
df <- titanic_train %>%
filter(Embarked != "C" & Embarked != "") %>%
mutate(grp = paste(Sex, Embarked)) #add a new column that combines Sex & Pclass
#code to make my table
table1 <- df %>%
select(grp, Age, Fare, Survived) %>%
tbl_summary(
by = grp, ##can't figure out a way to put 2 variables here (Sex & Embarked)
missing = "ifany",
statistic = all_continuous() ~ "{mean} ({std.error})",
digits = all_continuous() ~ 1) %>% #this puts 1 decimal place for all values
modify_header(stat_by = md("**{level}**<br>N = {n}")) %>%
bold_labels() %>%
modify_spanning_header(all_stat_cols() ~ "**These are the Columns I Want**") %>%
add_p(test = everything() ~ "aov", ##This is a 1-way ANOVA, but I need 2 variables
)
table1
#these are the p-values I want in my table:
two_way_anova_age <- aov(Age ~ Sex * Embarked, data = df)
summary(two_way_anova_age)
two_way_anova_fare <- aov(Fare ~ Sex * Embarked, data = df)
summary(two_way_anova_fare)
two_way_anova_surv <- aov(Survived ~ Sex * Embarked, data = df)
summary(two_way_anova_surv)