R: How to create result table of the results of multiple statistics tests?

Question

I am a complete beginner in R.

I ran a Chi-square test on each column of my data in R with this code:

apply(mydata, 2, chisq.test, p=expected.probability)

and got multiple results like this:

$Primary Tumor

Chi-squared test for given probabilities

data: newX[, i]
X-squared = 515108, df = 6, p-value < 2.2e-16

$Primary Tumor_1

Chi-squared test for given probabilities

data: newX[, i]
X-squared = 583205, df = 6, p-value < 2.2e-16

$Primary Tumor_2

Chi-squared test for given probabilities

data: newX[, i]
X-squared = 58089, df = 6, p-value < 2.2e-16

Can I extract a results table with the tumour name, X-squared statistic, df and p-value for the 50 samples I tested?

I can copy and paste into Excel, but I want to learn how to do it in code for larger samples.

Thank you:)

score 0 · Answer 1 · answered Feb 25 '20 at 19:22


Try this:

df <- apply(mydata, 2, chisq.test, p=expected.probability)

This just assigns the results to a variable that you can access from your environment. This related question may help you as well: "chi square test for each row in data frame".
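As a rough sketch of what accessing the stored results can look like: each element of the list is one test object, and its pieces can be pulled out with `$`. The data below and the name `tests` are made up for illustration, and equal expected proportions are assumed; substitute your own `mydata` and `expected.probability`.

```r
# Hypothetical stand-in data: 7 categories (rows) x 3 samples (columns)
set.seed(1)
mydata <- matrix(rpois(21, 50), ncol = 3)
colnames(mydata) <- c("Primary Tumor", "Primary Tumor_1", "Primary Tumor_2")
expected.probability <- rep(1/7, 7)  # assumption: equal expected proportions

# Store the list of test results instead of only printing it
tests <- apply(mydata, 2, chisq.test, p = expected.probability)

names(tests)                      # one entry per column of mydata
tests[["Primary Tumor"]]          # the full test object for one column
tests[["Primary Tumor"]]$p.value  # a single value from that test
```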


neuroandstats


Thanks for your help! I got it now:) – Kim So Yon Feb 25 '20 at 22:25
accept answers that are correct so others can see as well :) – neuroandstats Feb 25 '20 at 22:33

score 0 · Accepted Answer · answered Feb 25 '20 at 23:32


You can see the names of the elements returned by chisq.test:

names(chisq.test(matrix(1:4,ncol=2)))
[1] "statistic" "parameter" "p.value"   "method"    "data.name" "observed" 
[7] "expected"  "residuals" "stdres"

The values you need are statistic (chisq), parameter (df), p.value.
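For a single test, pulling out those three values might look like the sketch below. The counts and the name `one_test` are invented for illustration only.

```r
# One goodness-of-fit test on made-up counts (illustrative data only)
counts <- c(48, 52, 61, 39, 50)
one_test <- chisq.test(counts, p = rep(0.2, 5))

one_test$statistic  # named numeric; the name is "X-squared"
one_test$parameter  # named numeric; the name is "df"
one_test$p.value    # plain numeric

# The same three values collected into a one-row data frame
one_row <- data.frame(statistic = unname(one_test$statistic),
                      df        = unname(one_test$parameter),
                      p.value   = one_test$p.value)
one_row
```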

So we simulate data:

mydata = matrix(rpois(100,50),ncol=10)
colnames(mydata) = paste0("tumor",1:10)

And write a slightly more elaborate function to pull out these values after the test:

res = apply(mydata,2,function(x){
  chisq.test(x,p=rep(0.1,10))[c("statistic","parameter","p.value")]
})

And we make it a data.frame:

df = data.frame(id=names(res),do.call(rbind,res))
df

             id statistic parameter   p.value
tumor1   tumor1  4.322896         9 0.8889048
tumor2   tumor2  5.285714         9 0.8087245
tumor3   tumor3  2.803063         9 0.9715936
tumor4   tumor4   8.62578         9 0.4725097
tumor5   tumor5  13.22846         9 0.1525381
tumor6   tumor6  8.653768         9 0.4698283
tumor7   tumor7  7.666667         9 0.5680554
tumor8   tumor8  5.919132         9 0.7479838
tumor9   tumor9  8.051335         9 0.5289813
tumor10 tumor10  13.46875         9 0.1425173
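One thing to be aware of: because each element of res is itself a list, do.call(rbind,res) builds a list-matrix, so the columns of df end up as list columns rather than plain numeric vectors (which is likely why the numbers above print with differing precision). A possible variant, sketched below on the same kind of simulated data, has the function return a plain numeric vector so that every column is ordinary numeric. The names `get_stats`, `stats` and `tab` are made up for this illustration.

```r
set.seed(42)  # for reproducibility; values will differ from the table above
mydata <- matrix(rpois(100, 50), ncol = 10)
colnames(mydata) <- paste0("tumor", 1:10)

# Hypothetical helper: run one test and return three plain numbers
get_stats <- function(x, p) {
  tst <- chisq.test(x, p = p)
  c(statistic = unname(tst$statistic),
    df        = unname(tst$parameter),
    p.value   = tst$p.value)
}

# apply() over columns gives a 3 x 10 numeric matrix; transpose to 10 x 3
stats <- t(apply(mydata, 2, get_stats, p = rep(0.1, 10)))

tab <- data.frame(id = rownames(stats), stats, row.names = NULL)
str(tab)  # statistic, df and p.value are ordinary numeric columns
```

With plain numeric columns, things like sorting by p.value or writing the table out with write.csv() should behave as expected.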


StupidWolf


Thank you for your response! It really helped:) – Kim So Yon Feb 27 '20 at 00:15
Can I ask further question? I ran Chi-squared test for 50 samples, does this mean that I have to use adjusted p-values such as FDR instead of p-value to correct for multiple tests? – Kim So Yon Feb 27 '20 at 00:17
No problem :) Yes, in theory you should. Using my example above, you can do p.adjust(df$p.value,"BH"). BH means Benjamini-Hochberg correction. Let me know if you have similar problems along this line – StupidWolf Feb 27 '20 at 08:52
Where should I put p.adjust(df$p.value,"BH")? Instead of normal p value? I tried 'test <-p.adjust(res$p.value,"BH")' but it didn't work. It gives 'numeric(0)'.. Sorry for such a basic question...! – Kim So Yon Feb 28 '20 at 02:51
Hi @KimSoYon, using the example above, the results should be in df; so you do df$padj = p.adjust(df$p.value,"BH") ; and the adjusted p value is in the table – StupidWolf Feb 28 '20 at 07:32
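
To make the last two comments concrete, here is a small sketch. A list with one element per tumour has no top-level element called p.value, so res$p.value is NULL and p.adjust() returns numeric(0). The adjustment is applied to the p.value column of the results table instead. The p-values below are made up for illustration.

```r
# Illustrative results table (made-up p-values, not from real data)
df <- data.frame(id = paste0("tumor", 1:5),
                 p.value = c(0.001, 0.012, 0.030, 0.200, 0.800))

# Benjamini-Hochberg (FDR) adjustment, stored as a new column
df$padj <- p.adjust(df$p.value, method = "BH")
df

# Why the attempt on the list failed: a list of per-tumour results
# has no top-level p.value element
res <- list(tumor1 = list(p.value = 0.001), tumor2 = list(p.value = 0.012))
res$p.value                  # NULL
p.adjust(res$p.value, "BH")  # numeric(0)
```

The raw p.value column stays in the table next to padj, so both can be reported.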
