I'm writing a code that, from a table of raw data (output of microbiome QIIME analysis), generates a t-test per group of all the rows. Every row has a bacteria and the values corresponding for every sample. The table can be huge, like 80 columns per 400 rows.
Group_phylum_data:
label_Group Bacteria_Firmicutes Archaea_Other Archaea_Euryarchaeota Bacteria_Other
HC 6.771703e-05 0 0.000000000 9.480385e-04
HC 3.362588e-05 0 0.016835356 5.604313e-05
HC 0.000000e+00 0 0.000000000 2.209945e-04
EPI 0.000000e+00 0 0.001121252 2.466755e-04
EPI 0.000000e+00 0 0.000000000 3.335038e-04
So now these are just the first lines with 2 groups (HC and EPI). I want to run a t-test for each bacteria in the columns among the groups. I've found this pairwise_t_test from the rstatix package and it does exactly what I want, returning also the adjusted p-value. Since the groups can be more than 2, I chose this pairwise_t_test because it can handle them and perform the stats for every combination.
pwc1 <- Group_phylum_data %>%
pairwise_t_test(Bacteria_Firmicutes ~ label_Group, p.adjust.method = "bonferroni")
pwc1
The problem is that I can't find a way to make it a loop for entering each bacteria name and obtain a complete table with a bacteria per row and the stats in the corresponding columns, something like
.y. group1 group2 n1 n2 p p.signif p.adj p.adj.signif
<chr> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <chr>
1 Bacteria_Firmicutes EPI HC 46 28 0.82 ns 0.82 ns
2 Archaea_Other EPI HC 46 28 0.453 ns 0.453 ns
which I obtained by manually performing the analysis inserting the bacteria names. I tried to save the names in an array and substitute the single name (in the example, "Bacteria_Firmicutes") with something like names[i] but it doesn't work. Maybe that's a limit of this script, which only works with a specific name... or maybe I did something wrong? Or, is there another and maybe better way to obtain the output I want for this long dataset? Thank you!