I am using R and have a big dataset containing 12,224,433 rows. For every row I want to run a Spearman correlation test against one vector and extract the p-value. The script looks like this:
pvals <- numeric(nrow(SNP))
for (i in 1:nrow(SNP)) {
  fit <- cor.test(vector, as.numeric(SNP[i, 4:50]), method = 'spearman', exact = FALSE)
  pvals[i] <- fit$p.value
  names(pvals)[i] <- paste(SNP$V1[i], SNP$V2[i])
}
The problem is that it takes ages. By my calculation, the first 70,000 rows alone took 2 hours, so at that rate all 12,224,433 rows would take roughly 350 hours. Is there any way to speed it up?
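One possible direction, sketched under assumptions rather than tested on the full data: with exact=FALSE, cor.test computes the Spearman p-value from the Pearson correlation of the ranks using a t approximation with n - 2 degrees of freedom, so the same numbers can be computed for many rows at once without calling cor.test per row. The names toy, vec and spearman_pvals_fast below are hypothetical, the toy matrix stands in for as.matrix(SNP[, 4:50]), and the sketch assumes no missing values and no constant rows.

```r
# Toy stand-in for as.matrix(SNP[, 4:50]): 1000 rows x 47 samples (hypothetical data).
set.seed(1)
toy <- matrix(sample(0:2, 1000 * 47, replace = TRUE), nrow = 1000)
vec <- rnorm(47)  # stand-in for the vector each row is tested against

# Hypothetical helper: Spearman p-values for every row of `mat` against `vec`.
# Assumes no missing values and no constant rows.
spearman_pvals_fast <- function(mat, vec) {
  n <- length(vec)
  row_ranks <- apply(mat, 1, rank)              # n x nrow(mat): ranks of each row, ties averaged
  rho <- as.vector(cor(row_ranks, rank(vec)))   # Spearman rho = Pearson correlation of the ranks
  rho <- pmin(pmax(rho, -1), 1)                 # guard against rounding just outside [-1, 1]
  t_stat <- rho * sqrt((n - 2) / (1 - rho^2))   # t approximation with n - 2 degrees of freedom
  2 * pt(-abs(t_stat), df = n - 2)              # two-sided p-value
}

fast_p <- spearman_pvals_fast(toy, vec)
```

On the real data the names could be set once after the computation, e.g. names(pvals) <- paste(SNP$V1, SNP$V2), instead of inside the loop. If the full matrix does not fit comfortably in memory, the same function could be applied to blocks of rows in turn.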