I want to write correlation information to a data file as follows:

korelacja = cor(p2, d2, method = "pearson", use = "complete.obs")
korelacja2 = cor(p2, d2, method = "kendall", use = "complete.obs")
korelacja3 = cor(p2, d2, method = "spearman", use = "complete.obs")
dane = paste(korelacja, korelacja2, korelacja3, sep = ';')
write(dane, file = nazwa, append = TRUE)

The results look strange to me: the Pearson correlation is very high (always equal to one), but the Kendall and Spearman correlations are very low. I created scatterplots and I don't see a linear correlation.

Mateusz
  • I don't think anyone is going to be able to answer this without more information about the data that you're analyzing. Can you provide a sample? It seems likely that something is going wrong, since a Pearson correlation of exactly one would also imply a Spearman correlation of one. – bnaul Aug 26 '11 at 05:39
  • To be more specific: can you at least tell us the results of `str(p2)` and `str(d2)`? If `p2` and `d2` aren't too large, can you show us the results of `dput()`? See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example ... – Ben Bolker Aug 26 '11 at 14:23
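
For reference, the checks requested in the comments might look roughly like the sketch below. The vectors here are hypothetical stand-ins, since the real `p2` and `d2` are not shown in the question; `n_used` is likewise just an illustrative name.

```r
# Hypothetical stand-ins for the asker's vectors (the real p2 and d2 are unknown).
p2 <- c(1.2, 3.4, NA, 5.6, 7.8)
d2 <- c(2.1, 4.3, 6.5, NA, 8.7)

# Type, length, and the first few values of each vector.
str(p2)
str(d2)

# A copy-pasteable text representation of the first values.
dput(head(p2, 20))
dput(head(d2, 20))

# Number of pairs that cor(..., use = "complete.obs") would actually use.
n_used <- sum(complete.cases(p2, d2))
n_used
```

Pasting that output into the question would let others reproduce the correlations directly.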

1 Answer

It's not hard to replicate this pattern if you have some large outliers in your data that dominate the Pearson correlation but are relatively insignificant in the non-parametric (Kendall/Spearman) approaches. For example, here's a concocted data set with nothing going on except for one large outlier:

> set.seed(1001)
> x <- c(runif(1000),1e5)
> y <- c(runif(1000),1e5)
> cor(x,y,method="pearson")
[1] 1
> cor(x,y,method="kendall")
[1] -0.02216583
> cor(x,y,method="spearman")
[1] -0.03335352

This is consistent with your description so far, although you ought in this case to be able to see the outliers in your scatterplots ...
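
One way to check whether this is what is happening is sketched below, reusing the same concocted data. The names `r_spearman`, `r_rank`, `keep`, and `r_trimmed` are illustrative only, and dropping the single largest point is just a diagnostic, not a recommendation for how to treat real data.

```r
set.seed(1001)
x <- c(runif(1000), 1e5)
y <- c(runif(1000), 1e5)

# The outlier is obvious from the range, even if a scatterplot is hard to read.
range(x)
range(y)

# Spearman is Pearson computed on the ranks, so one huge value
# counts only as "the largest rank" and cannot dominate.
r_spearman <- cor(x, y, method = "spearman")
r_rank <- cor(rank(x), rank(y), method = "pearson")
all.equal(r_spearman, r_rank)

# Drop the single largest x value and recompute Pearson:
# the apparent perfect correlation disappears.
keep <- x < max(x)
r_trimmed <- cor(x[keep], y[keep], method = "pearson")
r_trimmed
```

If the Pearson value falls from one to something near the Kendall and Spearman values after removing a handful of extreme points, outliers are the likely explanation.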

Ben Bolker