What is the difference between cor and cor.test in R

Question

I have a data frame whose columns are different samples of an experiment. I want to find the correlation between these samples: between sample V2 and V3, between sample V2 and V4, and so on. This is the data frame:

> head(t1)
      V2          V3          V4         V5         V6
1 0.12725011 0.051021886 0.106049328 0.09378767 0.17799444
2 0.86096784 1.263327211 3.073650624 0.75607466 0.92244361
3 0.45791031 0.520207274 1.526476608 0.67499102 0.49817761
4 0.00000000 0.001139721 0.003158557 0.00000000 0.00000000
5 0.13383965 0.098943019 0.099922146 0.13871867 0.09750611
6 0.01016334 0.010187671 0.025410170 0.00000000 0.02369374
> nrow(t1)
[1] 23367

If I run the cor function on this data frame to get the correlation between samples (columns), I get NA for every pair of samples:

> cor(t1, method= "spearman")
   V2 V3 V4 V5 V6
V2  1 NA NA NA NA
V3 NA  1 NA NA NA
V4 NA NA  1 NA NA
V5 NA NA NA  1 NA
V6 NA NA NA NA  1

But if I run this:

> cor.test(t1[,1],t1[,2], method="spearman")$estimate
rho 
0.92394

the result is different. Why is this so? What is the correct way to get the correlation between these samples? Thank you in advance.

Roland · Answer 1 · 2013-02-02T11:12:50.090

6

Your data contains NA values.

From ?cor:

If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA.

From ?cor.test:

na.action a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").

On my system:

getOption("na.action")
[1] "na.omit"
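
To illustrate the difference, here is a minimal sketch on made-up data (the data frame d and its columns a, b, c are hypothetical, not the asker's t1). It assumes a single NA in one column and compares cor with its default use = "everything", cor with use = "pairwise.complete.obs", and cor.test on two vectors:

```r
# Toy data (hypothetical, not the asker's t1): one NA in column a
d <- data.frame(a = c(1.2, NA, 3.1, 4.8, 5.0, 6.3),
                b = c(2.0, 2.9, 3.5, 5.1, 4.7, 7.2),
                c = c(9.1, 7.4, 8.0, 3.3, 2.5, 1.0))

# Default use = "everything": every pair that involves column a is NA
m_default <- cor(d, method = "spearman")

# Drop incomplete observations separately for each pair of columns
m_pairwise <- cor(d, method = "spearman", use = "pairwise.complete.obs")

# cor.test on two vectors discards incomplete pairs before testing,
# so it returns an estimate even though a contains an NA
rho_ab <- unname(cor.test(d$a, d$b, method = "spearman")$estimate)
```

In this sketch m_default["a", "b"] is NA, while m_pairwise["a", "b"] and rho_ab should agree, because both are computed from the complete pairs only.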

Use which(!is.finite(as.matrix(t1))) to search for problematic values (is.finite has no method for data frames, so convert to a matrix first) and which(is.na(t1)) to search for NA values. cor returns NaN if you have Inf values in your data.
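
As a rough sketch of that check, assuming a small made-up data frame (d is hypothetical) with one NA and one Inf, the positions and per-column counts of problematic values could be found like this:

```r
# Toy data frame (hypothetical) with one NA and one Inf
d <- data.frame(a = c(1, NA, 3, 4),
                b = c(2, 5, Inf, 1))

# is.finite() has no data frame method, so work on a matrix copy
m <- as.matrix(d)

# Row/column positions of anything that is NA, NaN, Inf or -Inf
bad <- which(!is.finite(m), arr.ind = TRUE)

# Number of NA values per column (is.na() does accept a data frame)
na_per_col <- colSums(is.na(d))

# Number of infinite values per column
inf_per_col <- colSums(is.infinite(m))
```

Here bad has one row per problematic cell, so a single NA somewhere in a column is enough to explain an entire row and column of NA in the default cor output.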

edited Feb 02 '13 at 11:12

answered Feb 02 '13 at 10:56

How can I check whether my data frame contains NA or not? I think it includes plenty of Inf values as well. Will they also affect the result? Another question: I think cor.test is for pairwise correlation and needs two arguments. I think what I should use is cor, not cor.test, but I am still not sure whether it is the correct function for finding the correlation between the samples (columns) of a data frame. – hora Feb 02 '13 at 11:04
@hora See my edit to the answer and read the help pages. You can use `*apply` functions to do pairwise comparisons with `cor.test`. – Roland Feb 02 '13 at 11:13
Thanks @Roland. I have now checked my data frame; actually only one row has an NA, so I would expect only the values related to that row to be NA, not all of them. I also replaced the Inf values, but the result is still NA. My actual question is why, when I use cor.test on the same data set to compare only two samples, the result is not NA, but when I use cor on the whole data frame I get NA. :( – hora Feb 02 '13 at 12:00
@hora we can throw guesses at this all day long. Show us the data or reproduce your problem with a simple reproducible example. – Roman Luštrik Feb 02 '13 at 12:10
@hora You are not correlating rows, but columns. Please try to understand what I have written and read the help pages carefully. – Roland Feb 02 '13 at 12:11
Sorry Roland, I am a bit confused. Of course I am calculating for the columns; otherwise the row and column names of the cor output above would not be the data frame's column names. I have now run the command with the use parameter, like cor(q1,method="spearman",use="pairwise.complete.obs"), and the correlations are no longer NA. Do you think this is the correct parameter to use? @Roman Luštrik how should I provide the data? It is a very big matrix with 23367 rows. Is it possible to add files here? I cannot find how! – hora Feb 02 '13 at 14:34
@hora you would generally upload a subset of your data to a third party site and link to it. Or you can make a mock example (see http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Roman Luštrik Feb 02 '13 at 16:02
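
The suggestion in the comments to run cor.test pairwise with an `*apply` function could be sketched roughly as follows. The data frame d is made up (its column names only mimic the question's), and combining combn with apply is just one possible way to enumerate the column pairs:

```r
# Toy data (hypothetical); column names mimic the question's V2..V4
d <- data.frame(V2 = c(0.13, 0.86, 0.46, NA,   0.14, 0.01),
                V3 = c(0.05, 1.26, 0.52, 0.00, 0.10, 0.02),
                V4 = c(0.11, 3.07, 1.53, 0.01, 0.09, 0.03))

# All unordered pairs of column names, one pair per matrix column
pairs <- combn(names(d), 2)

# One Spearman test per pair; cor.test drops incomplete pairs itself
res <- apply(pairs, 2, function(p) {
  ct <- cor.test(d[[p[1]]], d[[p[2]]], method = "spearman")
  c(rho = unname(ct$estimate), p.value = ct$p.value)
})

# Label each result column with the pair it belongs to
colnames(res) <- paste(pairs[1, ], pairs[2, ], sep = "~")
```

The result res is a matrix with one column per pair and rows rho and p.value. If only the correlation matrix is needed, cor with use = "pairwise.complete.obs" is the shorter route; the loop is mainly useful when the p-values are wanted as well.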
