
This is related to this question about ggplot changing the order of variable names. The answer given there, which manually enforces a bijective reordering of factor levels, does not solve the problem in general. An example is melt(matcor(data.A,data.B)$XYcor), where the row names of the underlying matrix are repeated twice:

data.A <- as.data.frame(matrix(runif(25), nrow=5, ncol=5))
data.B <- as.data.frame(matrix(runif(25), nrow=5, ncol=5))

library(ggplot2)   # qplot()
library(reshape2)  # melt()
library(CCA)       # matcor(), img.matcor()
qplot(x=Var1, y=Var2, data=melt(matcor(data.A,data.B)$XYcor), fill=value, geom="tile")

The idea behind this plot is to show the cross-correlation between two multivariate sets, as img.matcor does:

img.matcor(matcor(data.A,data.B))

[Image: XYcor plot]

In this picture, the upper-left and lower-right quadrants are the autocorrelation matrices of data.A and data.B, while the other quadrants are flipped versions of the cross-correlation matrix. Reordering the data as ggplot2 does destroys this relationship. (On the other hand, ggplot makes this look way better.)

Unfortunately, calling melt with factor.levels=FALSE does not fix this, and melt already orders the columns properly in the first place. Is there a workaround?

bright-star

1 Answer


I think you get the wrong plot with qplot(), at least for the sample data, because data.A and data.B have the same column names, so qplot() shows only part of the actual correlations. If you change the column names of one of the data frames, the qplot() output looks more similar to img.matcor(). The only difference is that the autocorrelation matrices are now in the lower-left and upper-right parts.

set.seed(1)
data.A <- as.data.frame(matrix(runif(25), nrow=5, ncol=5))
# Give data.A distinct column names so they no longer clash with data.B's V1..V5
names(data.A) <- c("A1","A2","A3","A4","A5")
data.B <- as.data.frame(matrix(runif(25), nrow=5, ncol=5))
qplot(x=Var1, y=Var2, data=melt(matcor(data.A,data.B)$XYcor),
      fill=value, geom="tile")

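The overlap caused by shared column names can be checked directly. The sketch below assumes the CCA and reshape2 packages are installed and that both data frames keep their default V1..V5 names, as in the question; n_rows and n_tiles are only illustrative variable names. It compares the number of melted rows with the number of distinct (Var1, Var2) positions that geom="tile" can actually draw.

```r
library(CCA)       # matcor()
library(reshape2)  # melt()

set.seed(1)
data.A <- as.data.frame(matrix(runif(25), nrow = 5, ncol = 5))
data.B <- as.data.frame(matrix(runif(25), nrow = 5, ncol = 5))

# Both frames carry the default names V1..V5, so the dimnames of XYcor
# contain every name twice.
long <- melt(matcor(data.A, data.B)$XYcor)

n_rows  <- nrow(long)                               # one row per matrix cell
n_tiles <- nrow(unique(long[, c("Var1", "Var2")]))  # distinct tile positions

# If n_tiles is smaller than n_rows, several correlations share one tile
# and are drawn on top of each other.
print(c(rows = n_rows, tiles = n_tiles))
```

If the two numbers differ, the plot cannot show every cell of the matrix, no matter how the factor levels are ordered.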

Didzis Elferts
  • That's good to know. Unfortunately it ruins the convenient syntax of `data.table` stuff like `matcor(mydata[Class==0,mycols,with=FALSE], mydata[Class!=0,mycols,with=FALSE])` – bright-star Mar 25 '14 at 11:43
  • I see your point, but I'm not sure you can use the same column names: the same discrete names are then used for both the x and y axes, so those data are overlaid at the same positions on the plot. – Didzis Elferts Mar 25 '14 at 11:55
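One possible way to keep a one-line call while still avoiding the name clash is to prefix the column names on the fly. This is only a sketch under stated assumptions: with_prefix is a hypothetical helper (it is not part of CCA, reshape2, or ggplot2), and the "A." and "B." prefixes are arbitrary. The factor levels are set explicitly from the matrix, with the y axis reversed, so that the autocorrelation blocks should land in the upper-left and lower-right quadrants as described for img.matcor() in the question.

```r
library(CCA)       # matcor()
library(reshape2)  # melt()
library(ggplot2)   # ggplot(), geom_tile()

# Hypothetical helper: return a copy of df whose column names carry a
# prefix, so the two blocks never share a label.
with_prefix <- function(df, prefix) {
  df <- as.data.frame(df)
  setNames(df, paste0(prefix, names(df)))
}

set.seed(1)
data.A <- as.data.frame(matrix(runif(25), nrow = 5, ncol = 5))
data.B <- as.data.frame(matrix(runif(25), nrow = 5, ncol = 5))

xycor <- matcor(with_prefix(data.A, "A."), with_prefix(data.B, "B."))$XYcor
long  <- melt(xycor)

# Fix the axis order explicitly: x follows the matrix order, y is reversed
# so the first variable is drawn at the top.
long$Var1 <- factor(long$Var1, levels = colnames(xycor))
long$Var2 <- factor(long$Var2, levels = rev(rownames(xycor)))

p <- ggplot(long, aes(x = Var1, y = Var2, fill = value)) + geom_tile()
print(p)
```

With data.table subsets like the ones in the comment above, the same helper could presumably wrap each argument, e.g. with_prefix(mydata[Class==0,mycols,with=FALSE], "A."), though that has not been tested here.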