I am looking to apply t-tests to many columns in a dataset, split by a factor, using R. I found a solution here: Apply t-test on many columns in a dataframe split by factor
This code is taken from the above question:
df <- read.table(text="Group var1 var2 var3 var4 var5
1 3 5 7 3 7
1 3 7 5 9 6
1 5 2 6 7 6
1 9 5 7 0 8
1 2 4 5 7 8
1 2 3 1 6 4
2 4 2 7 6 5
2 0 8 3 7 5
2 1 2 3 5 9
2 1 5 3 8 0
2 2 6 9 0 7
2 3 6 7 8 8
2 10 6 3 8 0", header = TRUE)
t(sapply(df[-1], function(x)
  unlist(t.test(x ~ df$Group)[c("estimate", "p.value", "statistic", "conf.int")])))
The result:
estimate.mean in group 1 estimate.mean in group 2 p.value statistic.t conf.int1 conf.int2
var1 4.000000 3.000000 0.5635410 0.5955919 -2.696975 4.696975
var2 4.333333 5.000000 0.5592911 -0.6022411 -3.104788 1.771454
var3 5.166667 5.000000 0.9028444 0.1249164 -2.770103 3.103436
var4 5.333333 6.000000 0.7067827 -0.3869530 -4.497927 3.164593
var5 6.500000 4.857143 0.3053172 1.0925986 -1.803808 5.089522
This is exactly what I was after. However, my dataset also includes categorical data, such as sex and diagnosis (which has more than two possible values).
Is there a way to incorporate this into the above code? I am new to stats, but I believe a chi-square test is used to compare categorical variables between groups?
If this cannot be incorporated into the previous code, then separate code that tests the categorical data and produces a similar result table would also be a great help.
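To make the request concrete, here is a rough sketch of the kind of categorical counterpart I have in mind, assuming each categorical column is cross-tabulated against the grouping column. It uses the built-in mtcars data purely as a stand-in (am plays the role of Group; cyl, vs and gear are treated as categorical), and the names dat and cat_tests are just placeholders. chisq.test() warns when expected counts are small, so the sketch also reports fisher.test() for comparison; I am not sure which is more appropriate.

```r
# Stand-in data: built-in mtcars, with `am` playing the role of Group.
dat <- mtcars[c("am", "cyl", "vs", "gear")]
dat[] <- lapply(dat, factor)

# One row per categorical column: chi-square statistic, degrees of freedom
# and p-value, plus the Fisher exact test p-value for comparison.
cat_tests <- t(sapply(dat[-1], function(x) {
  tab <- table(x, dat$am)                    # contingency table: variable x group
  chi <- suppressWarnings(chisq.test(tab))   # warnings come from small expected counts
  c(chisq.statistic = unname(chi$statistic),
    chisq.df        = unname(chi$parameter),
    chisq.p.value   = chi$p.value,
    fisher.p.value  = fisher.test(tab)$p.value)
}))
cat_tests
```

This gives one row per variable, in the same spirit as the t-test table above.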
Any help would be greatly appreciated.
Thanks, Tom
EDIT:
Thanks for your replies.
I am working with transplant data and am looking to compare outcomes between patients who were on and off bypass at surgery. I am not sure of the best way to show my data, so I have copied it from a CSV file; hopefully it comes across okay.
Group,Age,Sex,Height,Weight,Diagnosis,Blood loss,Intubation time,Survival
On bypass,59,Male,165,102,Diagnosis 1,57,53,29
On bypass,44,Female,164,140,Diagnosis 1,114,15,35
On bypass,45,Male,165,119,Diagnosis 2,118,31,81
On bypass,26,Male,178,125,Diagnosis 1,171,36,31
On bypass,41,Female,177,105,Diagnosis 1,76,53,91
On bypass,43,Male,161,119,Diagnosis 3,97,38,63
Off bypass,53,Female,164,139,Diagnosis 1,125,49,51
Off bypass,26,Female,165,137,Diagnosis 3,29,7,86
Off bypass,30,Male,174,121,Diagnosis 1,174,43,100
Off bypass,59,Female,174,133,Diagnosis 1,40,16,43
Off bypass,63,Male,172,132,Diagnosis 2,32,46,10
I was planning to first ensure there is no significant difference between my two groups in terms of age, sex, height, weight and diagnosis.
I was then going to test the outcomes of the patients, including blood loss, intubation time and survival.
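As a starting point, this plan might be sketched as follows. It is only an attempt under several assumptions: numeric columns are compared with the same Welch t-test as in the code from the linked question, categorical columns with Fisher's exact test (chosen here because the counts are small), and Survival is treated as a plain number. The object names dat, num_tests and cat_tests are placeholders.

```r
# Data copied from the sample above; read.csv() turns "Blood loss" into
# Blood.loss and "Intubation time" into Intubation.time.
dat <- read.csv(text = "Group,Age,Sex,Height,Weight,Diagnosis,Blood loss,Intubation time,Survival
On bypass,59,Male,165,102,Diagnosis 1,57,53,29
On bypass,44,Female,164,140,Diagnosis 1,114,15,35
On bypass,45,Male,165,119,Diagnosis 2,118,31,81
On bypass,26,Male,178,125,Diagnosis 1,171,36,31
On bypass,41,Female,177,105,Diagnosis 1,76,53,91
On bypass,43,Male,161,119,Diagnosis 3,97,38,63
Off bypass,53,Female,164,139,Diagnosis 1,125,49,51
Off bypass,26,Female,165,137,Diagnosis 3,29,7,86
Off bypass,30,Male,174,121,Diagnosis 1,174,43,100
Off bypass,59,Female,174,133,Diagnosis 1,40,16,43
Off bypass,63,Male,172,132,Diagnosis 2,32,46,10", header = TRUE)

is_num <- sapply(dat, is.numeric)

# Numeric columns (baseline and outcome alike): t-test of each column by Group.
num_tests <- t(sapply(dat[is_num], function(x)
  unlist(t.test(x ~ dat$Group)[c("estimate", "p.value", "statistic", "conf.int")])))

# Categorical columns (non-numeric, excluding Group itself):
# Fisher's exact test on the variable-by-group contingency table.
cat_cols  <- setdiff(names(dat)[!is_num], "Group")
cat_tests <- sapply(dat[cat_cols], function(x)
  fisher.test(table(x, dat$Group))$p.value)

num_tests   # one row per numeric variable
cat_tests   # one p-value per categorical variable
```

With only five or six patients per group in this sample, I realise any of these tests will have limited power.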
Could anyone advise on the best tests to use for this analysis and, if possible, provide some help with the code to run them in R?
Thanks again, Tom