
I am having a problem with the colour mapping in ggpairs. When the variable used to set the colour is a factor built from character labels, things work as expected:

library(GGally)

data(state)
df <- data.frame(state.x77,
             State = state.name,
             Abbrev = state.abb,
             Region = state.region,
             Division = state.division
) 

col.index <- c(3,5,6,7)

p <- ggpairs(df, 

    # columns to include in the matrix
    columns = col.index,

    # what to include above the diagonal
    upper = list(continuous = "cor"),

    # what to include below the diagonal
    lower = list(continuous = "points"),

    # what to include in the diagonal
    diag = "blank",

    # how to label plots
    axisLabels = "show",

    # other parameters (legend, colour mapping, title)
    legends = FALSE,
    colour = "Region",
    title = "Plot Title"

)

print(p)

[Image: plot1]

Note that the order of the colours in the correlation plots is: green, blue, red, purple.

However, when the variable used to set the colour is a factor built from numeric codes instead:

df.numeric <- df
df.numeric$Region <- as.character(df.numeric$Region)
df.numeric$Region[which(df.numeric$Region == "Northeast")] <- 1
df.numeric$Region[which(df.numeric$Region == "South")] <- 3
df.numeric$Region[which(df.numeric$Region == "North Central")] <- 10
df.numeric$Region[which(df.numeric$Region == "West")] <- 13
df.numeric$Region <- factor(df.numeric$Region, levels = c(1,3,10,13))

p <- ggpairs(df.numeric, 

    # columns to include in the matrix
    columns = col.index,

    # what to include above the diagonal
    upper = list(continuous = "cor"),

    # what to include below the diagonal
    lower = list(continuous = "points"),

    # what to include in the diagonal
    diag = "blank",

    # how to label plots
    axisLabels = "show",

    # other parameters (legend, colour mapping, title)
    legends = FALSE,
    colour = "Region",
    title = "Plot Title"

)

print(p)

[Image: plot2]

The colours no longer match, even though I made sure the levels are in the correct order (1, 3, 10, 13).

For some reason the colours in the correlation plots have changed order: they are now green, purple, red, blue. The scatter plots, however, look the same as before, which means the colours no longer correspond to the same groups across panels.
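One check that might help narrow this down, using base R only: for these particular codes, the explicit level order and the alphabetical order of the labels are different. Any step that falls back to the character labels would therefore reorder the groups. Whether ggally_cor actually does this is an assumption and is not confirmed here. The `region` vector below is a hypothetical stand-in for `df.numeric$Region`.

```r
# Hypothetical stand-in for df.numeric$Region: one value per group code
region <- factor(c("1", "3", "10", "13"), levels = c(1, 3, 10, 13))

# Order as defined by the factor levels
level.order <- levels(region)

# Order obtained if the labels are treated as plain character strings
character.order <- sort(as.character(region))

print(level.order)
print(character.order)

# TRUE would mean both orders agree; for these codes they do not
print(identical(level.order, character.order))
```

If the two orders differ, comparing them against the colour order seen in the correlation panels may show which one is being used there.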

I will be using a custom list of colours, each of which must correspond to a specific numerical group (in order to match other plots I am generating). Does anyone know how to fix this?
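To make the requirement concrete, here is a minimal sketch of the mapping in plain ggplot2, outside ggpairs. A named vector passed to `scale_colour_manual()` is matched to factor levels by name rather than by position. The hex colours and the names `region.codes`, `group.colours` and `p.check` are placeholders invented for illustration. Whether the ggpairs panels, and ggally_cor in particular, respect such a scale is not confirmed here.

```r
library(ggplot2)

# Rebuild the recoded data from the question in a compact form
data(state)
df.numeric <- data.frame(state.x77, Region = state.region)
region.codes <- c("Northeast" = 1, "South" = 3, "North Central" = 10, "West" = 13)
df.numeric$Region <- factor(
    unname(region.codes[as.character(df.numeric$Region)]),
    levels = c(1, 3, 10, 13)
)

# Placeholder colours, keyed by level name so that each group keeps
# its colour regardless of the order in which groups are processed
group.colours <- c("1" = "#1B9E77", "3" = "#D95F02",
                   "10" = "#7570B3", "13" = "#E7298A")

# Single scatter plot as a reference for what each group should look like
p.check <- ggplot(df.numeric, aes(x = Illiteracy, y = Murder, colour = Region)) +
    geom_point() +
    scale_colour_manual(values = group.colours)

print(p.check)
```

A reference plot like this gives a fixed group-to-colour key against which the ggpairs panels can be compared.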

zx8754
user3570195
  • Thanks for adding the images @hrbrmstr! This is my first question so I didn't have enough points to include them. – user3570195 Apr 24 '14 at 19:42
  • This is weird! I had this hack for defining colours, but it doesn't work anymore either: http://stackoverflow.com/questions/14711550/is-there-a-way-to-change-the-color-palette-for-ggally-ggpairs-using-ggplot. I will have to look at the code of ggpairs and investigate. – infominer Apr 24 '14 at 20:41
  • I think it's a problem with the ggally_cor function, since it doesn't seem to affect the scatter plots... – user3570195 Apr 24 '14 at 21:04
  • If someone could suggest an alternative approach, I would appreciate that as well. I have some deadlines to meet. Thanks! – user3570195 Apr 24 '14 at 22:49
  • Do you need it to be ggplot based? Have you looked at the examples in ?pairs? They include a neat example for putting correlations in the upper triangle. There is also good ol' lattice: http://stat.ethz.ch/R-manual/R-devel/library/lattice/html/splom.html – infominer Apr 24 '14 at 22:50
  • I'm afraid I do need to use ggplot, so that the formatting matches with the other plots I'm generating. I would be open to only computing the overall correlation (not group specific), but then I would need a separate legend for the groups. – user3570195 Apr 24 '14 at 22:59
