
I have the following data:

dat <- lapply(1:84, function(l) rnorm(20) )
mat <- matrix(dat, nrow=6, ncol=14)

For each column, I now want to perform a paired statistical test between every pairwise combination of rows. What is the most vectorized, and hence most efficient, way of doing this so that I can extract a matrix of p-values for each column?

Also, what is the best way to display or visualize the resulting pairwise p-values? A matrix? If so, 36 - 15 = 21 of the cells will be redundant. Perhaps there is a better way?

Kaleb

1 Answer

set.seed(42)

mat <- matrix(rnorm(84), nrow=6, ncol=14)

res <- combn(seq_len(ncol(mat)), 2, FUN=function(ind) {
  res <- wilcox.test(mat[,ind[1]], mat[,ind[2]], paired=TRUE)$p.value
  c(ind,res)
})

res <- as.data.frame(t(res))
names(res) <- c("i", "j", "p")

# Adjust p-values for multiple testing, e.g., by controlling the false discovery rate
res$p <- p.adjust(res$p, method="fdr")

library(ggplot2)
ggplot(res, aes(y=i, x=j, fill=p)) + geom_tile()

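The code above tests pairs of columns of a plain numeric matrix. For the data layout in the question (a 6 x 14 list-matrix whose cells each hold 20 values), the same idea can be applied to pairs of rows within each column. The following is only a sketch under some assumptions: `pairwise_p`, `pairs` and `p_list` are illustrative names, the Wilcoxon signed-rank test is kept from the code above, and the FDR adjustment is applied within each column rather than across all columns. Only the lower triangle of each matrix is filled, so the redundant cells stay `NA`.

```r
set.seed(42)

# Data as in the question: a 6 x 14 list-matrix, each cell a vector of 20 values
dat <- lapply(1:84, function(l) rnorm(20))
mat <- matrix(dat, nrow = 6, ncol = 14)

# All 15 unordered pairs of rows, one pair per column of `pairs`
pairs <- combn(nrow(mat), 2)

# For one column of `mat`, return a 6 x 6 matrix with the adjusted p-value
# for rows (i, j) in the lower triangle and NA everywhere else
pairwise_p <- function(col) {
  p <- apply(pairs, 2, function(ind) {
    wilcox.test(mat[[ind[1], col]], mat[[ind[2], col]], paired = TRUE)$p.value
  })
  out <- matrix(NA_real_, nrow(mat), nrow(mat))
  # Assumption: adjust within this column only
  out[cbind(pairs[2, ], pairs[1, ])] <- p.adjust(p, method = "fdr")
  out
}

# One p-value matrix per column of `mat`
p_list <- lapply(seq_len(ncol(mat)), pairwise_p)
round(p_list[[1]], 3)
```

The loop over pairs is not vectorized in the strict sense, since `wilcox.test` handles one pair at a time, but with 15 pairs per column the cost is small.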

Roland
  • Thanks. The visualization is what I need apart from one thing: how do I get the pvalues within the cells (rounded to a suitable number of sig figs)? – Kaleb Jan 18 '14 at 15:46
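One possible way to print the p-values inside the cells is to add a `geom_text` layer on top of `geom_tile`. This is a sketch rather than a definitive solution: the `label` column, the choice of two significant figures, and the text size are assumptions made for illustration.

```r
library(ggplot2)

set.seed(42)
mat <- matrix(rnorm(84), nrow = 6, ncol = 14)

# Same pairwise column tests as in the answer
res <- combn(seq_len(ncol(mat)), 2, FUN = function(ind) {
  c(ind, wilcox.test(mat[, ind[1]], mat[, ind[2]], paired = TRUE)$p.value)
})
res <- as.data.frame(t(res))
names(res) <- c("i", "j", "p")
res$p <- p.adjust(res$p, method = "fdr")

# Label for each tile: the adjusted p-value at 2 significant figures
res$label <- signif(res$p, 2)

plt <- ggplot(res, aes(x = j, y = i, fill = p)) +
  geom_tile() +
  geom_text(aes(label = label), size = 3)
plt
```

`signif` keeps a fixed number of significant figures; `round(res$p, 3)` or `format.pval` could be swapped in if fixed decimals or a "<0.001" style is preferred.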