I'm completely new to programming and R, but have a dataset that can only be analyzed with a more powerful statistics program such as R.
I have a large but simple dataset consisting of thousands of different groups, each with multiple samples, and I want to compare every group against the control group with a Mann-Whitney U test. The data structure is shown below.
Group, Measurements
a 0.14534
cont 0.42574
d 0.36347
c 0.14284
a 0.23593
d 0.36347
cont 0.33514
cont 0.29210
b 0.36345
...
The problem is that the test compares exactly two groups at a time, so it fails when my file contains more than two groups.
This is what I have so far. As you can see, it does not repeat the test for each group, and it only works when the input file contains exactly two groups.
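To illustrate the restriction, here is a minimal sketch built from the sample rows above. It assumes the columns are named group and measurement and that the control rows are labeled cont; the object names (all_groups, pair, one_test) are just placeholders.

```r
# Toy data copied from the sample rows above; column names are assumed
data1 <- data.frame(
  group = c("a", "cont", "d", "c", "a", "d", "cont", "cont", "b"),
  measurement = c(0.14534, 0.42574, 0.36347, 0.14284, 0.23593,
                  0.36347, 0.33514, 0.29210, 0.36345),
  stringsAsFactors = FALSE
)

# With all five groups present, the formula interface stops with an error
all_groups <- try(wilcox.test(measurement ~ group, data = data1), silent = TRUE)
inherits(all_groups, "try-error")

# Keeping only one group plus the control leaves exactly two levels,
# so the same call runs
pair <- data1[data1$group %in% c("a", "cont"), ]
one_test <- wilcox.test(measurement ~ group, data = pair, exact = FALSE)
one_test$p.value
```

Doing this by hand for thousands of groups is not practical, which is why I need some kind of loop.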
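# Read the CSV chosen interactively; keep the group labels as plain strings
data1 <- read.csv(file.choose(), header=TRUE, stringsAsFactors=FALSE)
# Response on the left of ~, grouping variable on the right; unpaired is the default
testoutput <- wilcox.test(measurement ~ group, data=data1, mu=0, alternative="two.sided", conf.int=TRUE, conf.level=0.95, exact=FALSE, correct=TRUE)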
write.table(testoutput$p.value, file="mwUtest.tsv", sep="\t")
How do I write and loop the test properly so that every group is tested against my designated control group? I assume sapply or lapply should wrap the wilcox.test call, but I don't know how.
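For reference, the kind of loop I am picturing might look roughly like the sketch below. It is only a guess at the structure: it assumes the columns are named group and measurement, that the control rows are labeled cont, and it uses the sample rows above in place of my real file. The names control, other_groups, p_values and results are placeholders.

```r
# Toy data copied from the sample rows above; column names are assumed
data1 <- data.frame(
  group = c("a", "cont", "d", "c", "a", "d", "cont", "cont", "b"),
  measurement = c(0.14534, 0.42574, 0.36347, 0.14284, 0.23593,
                  0.36347, 0.33514, 0.29210, 0.36345),
  stringsAsFactors = FALSE
)

# Measurements of the control group, reused in every comparison
control <- data1$measurement[data1$group == "cont"]

# Every group label except the control
other_groups <- setdiff(unique(data1$group), "cont")

# One two-sample test per group; sapply returns a vector of p-values
# named after the groups
p_values <- sapply(other_groups, function(g) {
  wilcox.test(data1$measurement[data1$group == g], control,
              alternative = "two.sided", exact = FALSE, correct = TRUE)$p.value
})

# One row per group, written as a tab-separated file
results <- data.frame(group = names(p_values), p.value = unname(p_values))
write.table(results, file = "mwUtest.tsv", sep = "\t",
            row.names = FALSE, quote = FALSE)
```

Here each group's measurements are passed as the first sample and the control as the second, so the two-group requirement is met inside every call.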
I'm sorry if this simple question has been brought up before, but I could not find any previous question regarding this specific problem.