I have a large data set, and the raw output of cor() makes it hard to distinguish between high and low correlations.

Could someone show me an example of how to add colours or stars (* ** ***) or something similar to the correlation matrix, so that I can easily spot significant values?

David Arenburg
mix
  • Have a look at the `corrplot` package: you can represent the correlation matrix graphically and show only the significant values – etienne Oct 27 '15 at 10:36
  • The psych package is useful here: `?psych::corr.test`. http://stackoverflow.com/questions/26574670/corrplot-shows-insignificant-correlation-coefficients-even-when-insig-blank?answertab=votes#tab-top might be useful to help plot – user20650 Oct 27 '15 at 10:48
  • http://stackoverflow.com/questions/32971990/please-help-to-plot-pairwise-correlation#comment53773848_32971990 has some code to plot significance stars - it does need tidying up a bit – user20650 Oct 27 '15 at 10:55
  • Wow, thanks! That corrplot thing is great. – mix Oct 27 '15 at 10:56

2 Answers

What about a heatmap?

Imagine mtcars is your dataset.

You can reshape the correlation matrix into long format, as explained here:

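library(reshape2)  # melt()
library(dplyr)     # arrange()
ccor = cor(mtcars[, 3:10])  # whichever variables you need
cormatrix = arrange(melt(ccor), -abs(value))  # long format (Var1, Var2, value), sorted by |value|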

Then you can draw a heatmap, as explained here:

library(ggplot2)
ggplot(cormatrix, aes(Var1, Var2)) + geom_tile(aes(fill = value), colour = "white") + scale_fill_gradient(low = "white", high = "steelblue")

You get:

[image: the resulting heatmap]

Hope this helps.

You can also add the values with + geom_text(aes(label = round(value, 1))), according to this.
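The question also asked for significance stars, which the heatmap alone does not show. Here is a minimal base R sketch of one way to get them, assuming Pearson p-values from cor.test() and the conventional cut-offs 0.001, 0.01 and 0.05. The helper names p_matrix and stars are made up for this example and are not part of any package.

```r
# Hypothetical helper: matrix of p-values from pairwise cor.test() calls
p_matrix <- function(df) {
  n <- ncol(df)
  p <- matrix(NA_real_, n, n, dimnames = list(names(df), names(df)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # the diagonal is left as NA: a variable against itself is not tested
      if (i != j) p[i, j] <- cor.test(df[[i]], df[[j]])$p.value
    }
  }
  p
}

# Hypothetical helper: p-values to stars (assumed cut-offs: 0.001, 0.01, 0.05)
stars <- function(p) {
  s <- as.character(cut(p, breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                        labels = c("***", "**", "*", "")))
  s[is.na(s)] <- ""  # no stars where no test was run
  s
}

dat  <- mtcars[, 3:10]
ccor <- cor(dat)
pmat <- p_matrix(dat)

# Correlations rounded to 2 decimals with stars appended, same shape as ccor
starred <- matrix(paste0(format(round(ccor, 2), nsmall = 2), stars(pmat)),
                  nrow = nrow(ccor), dimnames = dimnames(ccor))
print(noquote(starred))
```

The same star strings could be melted alongside the correlations and used as the label in geom_text() if you want them on the heatmap tiles. Note that these p-values are not corrected for multiple testing.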

giac
You can return the results of your correlations to a data frame, and then you can sort, subset, etc.

library(broom)
library(dplyr)

n <- length(mtcars)  # number of columns
cor.list <- vector("list", n^2)

for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    # tidy() turns each test result into a one-row data frame;
    # (i - 1) * n + j gives every pair its own slot in the list
    cor.list[[(i - 1) * n + j]] <-
      tidy(cor.test(mtcars[, i], mtcars[, j])) %>%
      mutate(x = names(mtcars)[i],
             y = names(mtcars)[j])
  }
}

bind_rows(cor.list)
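
To illustrate the "sort, subset" step, here is a rough base R variant of the same idea, offered as a sketch rather than as part of the approach above. It assumes you only want each pair of columns once and that 0.05 is an acceptable significance threshold. The names pairs, cor.df and sig are chosen for this example only.

```r
# All unordered column pairs (each pair once, no self-correlations)
pairs <- combn(names(mtcars), 2, simplify = FALSE)

# One row per pair: variable names, correlation estimate and p-value
cor.df <- do.call(rbind, lapply(pairs, function(p) {
  ct <- cor.test(mtcars[[p[1]]], mtcars[[p[2]]])
  data.frame(x = p[1], y = p[2],
             estimate = unname(ct$estimate),
             p.value = ct$p.value)
}))

# Keep significant pairs (assumed threshold: 0.05), strongest first
sig <- subset(cor.df, p.value < 0.05)
sig <- sig[order(-abs(sig$estimate)), ]
head(sig)
```

Working with pairs from combn() avoids the duplicated (x, y) / (y, x) rows and the trivial self-correlations that the full double loop produces.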
Benjamin