I'm trying to figure out if there is a straightforward way to create a table of two-sample t-tests (one test per variable, each comparing the same two groups) using tidyverse packages. There are already Q&As addressing this topic (e.g., here), but the existing answers all seem pretty convoluted.
Here's a reproducible example showing what I'm trying to accomplish: a column of variable names, one column of means for each of the two groups, and a column of p-values:
library(dplyr)
library(infer)
library(tidyr)
df <- mtcars %>%
  mutate(engine = if_else(vs == 0, "V-shaped", "straight"))
v_shaped <- df %>%
  filter(engine == "V-shaped") %>%
  summarise(across(c(mpg, disp), mean)) %>%
  pivot_longer(cols = everything()) %>%
  rename(V_shaped = value)
straight <- df %>%
  filter(engine == "straight") %>%
  summarise(across(c(mpg, disp), mean)) %>%
  pivot_longer(cols = everything()) %>%
  rename(straight = value)
mpg <- df %>%
  t_test(formula = mpg ~ engine, alternative = "two-sided") %>%
  select(p_value) %>%
  mutate(name = "mpg")
disp <- df %>%
  t_test(formula = disp ~ engine, alternative = "two-sided") %>%
  select(p_value) %>%
  mutate(name = "disp")
p_values <- bind_rows(mpg, disp)
table <- v_shaped %>%
  full_join(straight, by = "name") %>%
  full_join(p_values, by = "name")
table
#> # A tibble: 2 × 4
#>   name  V_shaped straight    p_value
#>   <chr>    <dbl>    <dbl>      <dbl>
#> 1 mpg       16.6     24.6 0.000110
#> 2 disp     307.     132.  0.00000248
Obviously, this approach is clumsy even for two variables, since every variable needs its own copy of the test block, and it does not scale. But it does illustrate the intended outcome. Is there a way to do this in one pipeline? My actual use case involves many more variables, so ideally I'd be able to feed a vector of variable names into the pipe.
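To make the target concrete, here is a rough sketch of the kind of single pipeline I have in mind. It is only one possible direction, not a vetted solution. It assumes base R's `t.test()` (Welch by default) is an acceptable stand-in for `infer::t_test()`, and `vars` is just a hypothetical name for the vector of variable names:

```r
library(dplyr)
library(tidyr)

df <- mtcars %>%
  mutate(engine = if_else(vs == 0, "V-shaped", "straight"))

# Hypothetical vector of variable names to test
vars <- c("mpg", "disp")

result <- df %>%
  select(engine, all_of(vars)) %>%
  # One row per observation per variable; default columns are `name` and `value`
  pivot_longer(cols = all_of(vars)) %>%
  group_by(name) %>%
  summarise(
    V_shaped = mean(value[engine == "V-shaped"]),
    straight = mean(value[engine == "straight"]),
    # Assumption: base t.test() (Welch, two-sided) in place of infer::t_test()
    p_value = t.test(value[engine == "V-shaped"],
                     value[engine == "straight"])$p.value,
    .groups = "drop"
  ) %>%
  # Restore the order in which the variables were listed in `vars`
  arrange(match(name, vars))

result
```

Adding more variables would then only mean extending `vars`, but I don't know whether this is the idiomatic way to do it or whether it can be done while staying with `infer`.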