
I am a complete beginner in R.

I ran a chi-square test on each column of my data in R with this code:

apply(mydata, 2, chisq.test, p=expected.probability)

and got multiple results like this:

$Primary Tumor

Chi-squared test for given probabilities

data:  newX[, i]
X-squared = 515108, df = 6, p-value < 2.2e-16

$Primary Tumor_1

Chi-squared test for given probabilities

data:  newX[, i]
X-squared = 583205, df = 6, p-value < 2.2e-16

$Primary Tumor_2

Chi-squared test for given probabilities

data:  newX[, i]
X-squared = 58089, df = 6, p-value < 2.2e-16

Can I extract a results table with the tumour name, X-squared statistic, df and p-value for the 50 samples I tested?

I can copy and paste the results into Excel, but I want to learn how to do this in code so it works for larger sample sets.

Thank you:)

Greg
Kim So Yon

2 Answers


Try this:

df <- apply(mydata, 2, chisq.test, p=expected.probability)

This just assigns the results to a variable that you can access from your environment. This related question may help you as well: "chi square test for each row in data frame".
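As a minimal sketch of what "accessing" the stored results could look like: the data below are simulated stand-ins (7 categories, 3 tumour columns), and `expected.probability` is a hypothetical vector of equal proportions, since the question does not show the real values. The result is stored as `res` here rather than `df`, purely to avoid suggesting it is a data frame.

```r
# Simulated stand-in for the asker's data (assumption: 7 categories per tumour)
set.seed(1)
mydata <- matrix(rpois(21, 50), ncol = 3)
colnames(mydata) <- c("Primary Tumor", "Primary Tumor_1", "Primary Tumor_2")
expected.probability <- rep(1/7, 7)  # hypothetical expected proportions

# Store the list of test results instead of only printing it
res <- apply(mydata, 2, chisq.test, p = expected.probability)

# Each element is one test result; pull out single values by name
res[["Primary Tumor"]]$statistic
res[["Primary Tumor"]]$p.value

# Or collect the same value from every test at once
pvals <- sapply(res, function(x) x$p.value)
pvals
```

`pvals` is a numeric vector named after the columns of `mydata`, so the same pattern can be repeated for `statistic` and `parameter`.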

neuroandstats

You can see the names of the elements that chisq.test returns:

names(chisq.test(matrix(1:4,ncol=2)))
[1] "statistic" "parameter" "p.value"   "method"    "data.name" "observed" 
[7] "expected"  "residuals" "stdres" 

The values you need are statistic (the X-squared value), parameter (the df) and p.value.

So we simulate data:

mydata = matrix(rpois(100,50),ncol=10)
colnames(mydata) = paste0("tumor",1:10)

Then write a slightly more elaborate function that extracts these elements after each test:

# run the test on every column and keep only the three elements of interest
res = apply(mydata,2,function(x){
  chisq.test(x,p=rep(0.1,10))[c("statistic","parameter","p.value")]
})

And we make it a data.frame:

df = data.frame(id=names(res),do.call(rbind,res))
df

             id statistic parameter   p.value
tumor1   tumor1  4.322896         9 0.8889048
tumor2   tumor2  5.285714         9 0.8087245
tumor3   tumor3  2.803063         9 0.9715936
tumor4   tumor4   8.62578         9 0.4725097
tumor5   tumor5  13.22846         9 0.1525381
tumor6   tumor6  8.653768         9 0.4698283
tumor7   tumor7  7.666667         9 0.5680554
tumor8   tumor8  5.919132         9 0.7479838
tumor9   tumor9  8.051335         9 0.5289813
tumor10 tumor10  13.46875         9 0.1425173
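One detail worth knowing: because each element of `res` is a list, `do.call(rbind, res)` produces list columns, which is why some values print with different widths. If plain numeric columns are preferred, a possible variant (a sketch on the same kind of simulated data, with `tests` and `results` as illustrative names) is to extract each value separately:

```r
# Same kind of simulated data as above
set.seed(42)
mydata <- matrix(rpois(100, 50), ncol = 10)
colnames(mydata) <- paste0("tumor", 1:10)

# Keep the full test objects, one per column
tests <- apply(mydata, 2, chisq.test, p = rep(0.1, 10))

# Extract each value into an ordinary numeric column
results <- data.frame(
  id        = names(tests),
  statistic = unname(sapply(tests, function(x) x$statistic)),
  parameter = unname(sapply(tests, function(x) x$parameter)),
  p.value   = unname(sapply(tests, function(x) x$p.value))
)
str(results)
```

A table built this way can be saved for Excel with write.csv(results, "chisq_results.csv", row.names = FALSE), where the file name is only an example.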
StupidWolf
  • Thank you for your response! It really helped:) – Kim So Yon Feb 27 '20 at 00:15
  • Can I ask a further question? I ran the Chi-squared test on 50 samples. Does this mean that I have to use adjusted p-values such as FDR instead of the raw p-values to correct for multiple tests? – Kim So Yon Feb 27 '20 at 00:17
  • No problem :) Yes, in theory you should. Using my example above, you can do p.adjust(df$p.value,"BH"). BH means the Benjamini-Hochberg correction. Let me know if you have similar problems along this line – StupidWolf Feb 27 '20 at 08:52
  • Where should I put p.adjust(df$p.value,"BH")? Instead of normal p value? I tried 'test <-p.adjust(res$p.value,"BH")' but it didn't work. It gives 'numeric(0)'.. Sorry for such a basic question...! – Kim So Yon Feb 28 '20 at 02:51
  • Hi @KimSoYon, using the example above, the results should be in df; so you do df$padj = p.adjust(df$p.value,"BH") ; and the adjusted p value is in the table – StupidWolf Feb 28 '20 at 07:32
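The exchange in the comments can be sketched end to end as follows, reusing the simulated data from the answer (so the numbers are illustrative only). It also shows why `p.adjust(res$p.value, "BH")` returned `numeric(0)`: `res` is a list with one element per tumour, so it has no top-level `p.value` element and `res$p.value` is `NULL`. The `unlist()` call is an addition here to turn the list column into a plain numeric vector.

```r
# Rebuild the answer's objects on simulated data
set.seed(42)
mydata <- matrix(rpois(100, 50), ncol = 10)
colnames(mydata) <- paste0("tumor", 1:10)
res <- apply(mydata, 2, function(x) {
  chisq.test(x, p = rep(0.1, 10))[c("statistic", "parameter", "p.value")]
})
df <- data.frame(id = names(res), do.call(rbind, res))

# res has elements tumor1 ... tumor10, but none called p.value
is.null(res$p.value)

# Adjust the p-values stored in the table and keep them as a new column
df$padj <- p.adjust(unlist(df$p.value), method = "BH")
df[, c("id", "p.value", "padj")]
```

The raw p-values stay in `df$p.value`, so both columns can be reported side by side.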