Hopefully someone here will be able to help me with a problem that I'm having with a ggplot script I'm trying to get right. The script will be used many times with different data, so it needs to be relatively flexible. I've got it almost where I want it, but I've come across a problem I haven't been able to solve.

The script is for a line graph with a label for each line in the right-hand margin. Sometimes the graph is faceted, other times it is not.

The piece I'm having trouble with is the label colors. I would like the labels in the right margin to be black if there was no significant change over time, green if there was positive change, and red if there was negative change. My script does this correctly when there is only a single facet, but as soon as the graph has multiple facets, the color coding of the labels gives the following error:

   Error: Incompatible lengths for set aesthetics:

Below are the script and data for the multi-facet case. The problem seems to be in the way that I'm specifying color in the geom_text line. If I delete the color argument from the geom_text line, the attribute labels are printed in the correct place, just not colored. I'm really at a loss on this one. This is my first post here, so let me know if I've done anything wrong with my post.

WITH MULTIPLE FACETS (DOES NOT WORK)

require(ggplot2)
require(grid)
require(zoo)
require(reshape)
require(reshape2)
require(directlabels)

time.data<-structure(list(
  Attribute = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L),
    .Label = c("Taste 1", "Taste 2", "Taste 3", "Use 1", "Use 2", "Use 3"),
    class = "factor"),
  Attribute.Category = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L),
    .Label = c("Nutritional/Usage", "Taste/Quality"), class = "factor"),
  Attribute.Order = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L),
  Category.Order = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
  Color = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L),
    .Label = c("#084594", "#2171B5", "#4292C6", "#6A51A3", "#807DBA", "#9E9AC8"),
    class = "factor"),
  value = c(75L, 78L, 90L, 95L, 82L, 80L, 43L, 40L, 25L, 31L, 84L, 84L),
  Date2 = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L),
    .Label = c("1/1/2013", "9/1/2012"), class = "factor")),
  .Names = c("Attribute", "Attribute.Category", "Attribute.Order",
    "Category.Order", "Color", "value", "Date2"),
  class = "data.frame", row.names = c(NA, -12L))

label.data<-structure(list(
  7:12,
  Attribute = structure(1:6,
    .Label = c("Taste 1", "Taste 2", "Taste 3", "Use 1", "Use 2", "Use 3"),
    class = "factor"),
  Attribute.Category = structure(c(2L, 2L, 2L, 1L, 1L, 1L),
    .Label = c("Nutritional/Usage", "Taste/Quality"), class = "factor"),
  Attribute.Order = 1:6,
  Category.Order = c(1L, 1L, 1L, 2L, 2L, 2L),
  Color = structure(1:6,
    .Label = c("#084594", "#2171B5", "#4292C6", "#6A51A3", "#807DBA", "#9E9AC8"),
    class = "factor"),
  Significance = structure(c(2L, 3L, 1L, 1L, 3L, 2L),
    .Label = c("neg", "neu", "pos"), class = "factor"),
  variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1/1/2013", class = "factor"),
  value = c(78L, 95L, 80L, 40L, 31L, 84L),
  Date2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "2013-01-01", class = "factor"),
  label.color = structure(c(1L, 2L, 3L, 3L, 2L, 1L),
    .Label = c("black", "forestgreen", "red"), class = "factor")),
  .Names = c("", "Attribute", "Attribute.Category", "Attribute.Order",
    "Category.Order", "Color", "Significance", "variable", "value", "Date2",
    "label.color"),
  class = "data.frame", row.names = c(NA, -6L))

color.palette<-as.character(unique(time.data$Color))

time.data$Date2<-as.Date(time.data$Date2,format="%m/%d/%Y")

plot<-ggplot()+
  geom_line(data=time.data,aes(as.numeric(time.data$Date2),time.data$value,group=time.data$Attribute,color=time.data$Color),size=1)+
  geom_text(data=label.data,aes(x=Inf, y=label.data$value, label=paste("  ",label.data$Attribute)),
            color=label.data$label.color,
            size=4,vjust=0, hjust=0,na.rm=T)+
  facet_grid(Attribute.Category~.,space="free")+
  theme_bw()+
  scale_x_continuous(breaks=as.numeric(unique(time.data$Date2)),labels=format(unique(time.data$Date2),format = "%b %Y"))+
  theme(strip.background=element_blank(),
        strip.text.y=element_blank(),
        legend.text=element_blank(),
        legend.title=element_blank(),
        plot.margin=unit(c(1,5,1,1),"cm"),
        legend.position="none")+
  scale_colour_manual(values=color.palette)

gt3 <- ggplot_gtable(ggplot_build(plot))
gt3$layout$clip[gt3$layout$name == "panel"] <- "off"
grid.draw(gt3)
tkvaran
  • Welcome to StackOverflow. Good to see you supplying data and code. May I suggest you update your code to supply values for the two variables `plot.start` and `plot.end`, which are at this point undefined. – SlowLearner Jan 17 '13 at 20:16
  • @user19686010 I suspect that the reason that nobody has touched this yet is that it is not a 'minimal' example i.e. you have given us a lot of (quite verbose) code instead of a small tight example that is easy for third parties to understand in a short time. This obscures the real nature of the problem and reduces the value of the question and the answer. I have fallen into this trap myself on SO. As [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) very useful post points out, you should supply the minimal runnable code. – SlowLearner Jan 17 '13 at 20:29
  • Thanks SlowLearner. I'll try to put something more minimal tonight. For now, I've updated the code to include plot.start and plot.end. – tkvaran Jan 17 '13 at 20:38
  • You have two large data frames (`time.data` and `label.data`) so one way to approach this would be to create 2 similar data frames with only, say, 3-4 dates for the x-axis, 3 x `Attribute` instead of 10 (short text like `High`, `Low`, `Med` please!), 2 x `Attribute.Category`. Also do you really need `Significance`, `variable` and the other columns for **this** example? If not, remove all those. Remove `major.line.color`, `ymin` and the other declarations - not relevant to the question. Finally, omit the `theme()` stuff: it's just cosmetic tweaking you can add back later. Good luck! – SlowLearner Jan 17 '13 at 20:51
  • Thanks again Slowlearner. I've just edited my original response with a more minimal example. I did leave some of the theme stuff, but only what I thought was really critical. – tkvaran Jan 17 '13 at 22:32
  • One note. If you comment out the "color=" line of the geom_text in the above code, the graph will run with the exception that the color coding of the labels in the right margin does not work. When working correctly, the labels should be colored according to the values in the label.data$label.color variable. – tkvaran Jan 17 '13 at 22:41

1 Answer

Some problems:

Inside your aesthetic declarations, you should not reference the data columns as time.data$Date2, but just as Date2. The data argument specifies where to look for that information. All of the columns for a given layer need to be in the same data.frame, but, as you take advantage of, the data.frame can vary from layer to layer.

In the geom_text call, color was outside the aes call. If you are mapping an aesthetic to a column of the layer's data.frame, it has to be inside aes. Once the first problem is fixed, leaving color outside aes would throw a different error: label.color could not be found at all, because arguments outside aes are not looked up in label.data.

With those two fixed, scale_colour_manual complains that there are 9 colors and you have only supplied 6. That is because there are 6 colors from the lines and 3 from the text, and both layers share one colour scale. Since you specified these as actual color names, you can just use scale_colour_identity.
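To make the last two points concrete, here is a minimal sketch on hypothetical toy data (line.df, label.df and their columns are made up for illustration, not taken from the question). As far as I understand it, a colour vector set outside aes is not split along with the data when the plot is faceted, so each panel ends up comparing its own few rows against the full-length vector; mapping the column inside aes avoids that because the colour travels with each row.

```r
library(ggplot2)

# Hypothetical toy data: two lines and two labels, one of each per facet.
line.df <- data.frame(
  x = c(1, 2, 1, 2), y = c(1, 2, 2, 1),
  id = c("a", "a", "b", "b"),
  panel = c("top", "top", "bottom", "bottom"),
  line.col = c("#084594", "#084594", "#6A51A3", "#6A51A3"),
  stringsAsFactors = FALSE
)
label.df <- data.frame(
  x = c(2, 2), y = c(2, 1), id = c("a", "b"),
  panel = c("top", "bottom"),
  text.col = c("forestgreen", "red"),
  stringsAsFactors = FALSE
)

p <- ggplot() +
  geom_line(data = line.df, aes(x, y, group = id, colour = line.col)) +
  geom_text(data = label.df, aes(x, y, label = id, colour = text.col)) +
  facet_grid(panel ~ .) +
  scale_colour_identity()

# Inspect the built text layer: each label keeps its own colour
# after the data has been split by facet.
built.labels <- ggplot_build(p)$data[[2]]
built.labels[, c("PANEL", "label", "colour")]

# Both layers feed the same colour scale, so the scale sees the union
# of the line colours and the label colours.
scale.values <- union(line.df$line.col, label.df$text.col)
length(scale.values)
```

With scale_colour_identity the values in the data are used as the colours directly, so there is no palette whose length has to match the number of distinct values.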

Putting this all together:

plot <- ggplot()+
  geom_line(data=time.data, aes(as.numeric(Date2), value, 
                                group=Attribute, color=Color), 
            size=1)+
  geom_text(data=label.data, aes(x=Inf, y=value, 
                                 label=paste("  ",Attribute),
                                 color=label.color),
            size=4,vjust=0, hjust=0)+
  facet_grid(Attribute.Category~.,space="free") +
  scale_x_continuous(breaks=as.numeric(unique(time.data$Date2)),
                     labels=format(unique(time.data$Date2),format = "%b %Y")) +
  scale_colour_identity() +
  theme_bw()+
  theme(strip.background=element_blank(),
        strip.text.y=element_blank(),
        legend.text=element_blank(),
        legend.title=element_blank(),
        plot.margin=unit(c(1,5,1,1),"cm"),
        legend.position="none")
gt3 <- ggplot_gtable(ggplot_build(plot))
gt3$layout$clip[gt3$layout$name == "panel"] <- "off"
grid.draw(gt3)

To get an idea how much you can strip down your example, this is much closer to minimal:

time.data <- 
structure(list(Attribute = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 
4L, 4L), .Label = c("Taste 1", "Taste 2", "Use 1", "Use 2"), class = "factor"), 
    Attribute.Category = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 
    1L, 1L), .Label = c("Nutritional/Usage", "Taste/Quality"), class = "factor"), 
    Color = c("#084594", "#084594", "#2171B5", "#2171B5", "#6A51A3", 
    "#6A51A3", "#807DBA", "#807DBA"), value = c(75L, 78L, 90L, 
    95L, 43L, 40L, 25L, 31L), Date2 = structure(c(15584, 15706, 
    15584, 15706, 15584, 15706, 15584, 15706), class = "Date")), .Names = c("Attribute", 
"Attribute.Category", "Color", "value", "Date2"), row.names = c(NA, 
-8L), class = "data.frame")

label.data <- 
structure(list(value = c(78L, 95L, 40L, 31L), Attribute = structure(1:4, .Label = c("Taste 1", 
"Taste 2", "Use 1", "Use 2"), class = "factor"), label.color = c("black", 
"forestgreen", "red", "forestgreen"), Attribute.Category = structure(c(2L, 
2L, 1L, 1L), .Label = c("Nutritional/Usage", "Taste/Quality"), class = "factor"), 
    Date2 = structure(c(15706, 15706, 15706, 15706), class = "Date")), .Names = c("value", 
"Attribute", "label.color", "Attribute.Category", "Date2"), row.names = c(NA, 
-4L), class = "data.frame")

ggplot() +
  geom_line(data = time.data, 
            aes(x=Date2, y=value, group=Attribute, colour=Color)) +
  geom_text(data = label.data,
            aes(x=Date2, y=value, label=Attribute, colour=label.color),
            hjust = 1) +
  facet_grid(Attribute.Category~.) +
  scale_colour_identity()

The theme settings (and the code that makes the labels visible outside the plot) aren't relevant to the question, nor are the x-axis conversions from Date to numeric that are only there to allow x=Inf. I also trimmed the data to just the needed columns, and reduced each categorical variable to only two categories.
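As a side note on the label-visibility part: the gtable edit used in the full version turns off clipping for the panels. A possible alternative, assuming a newer ggplot2 than this answer was written against (coord_cartesian gained a clip argument in version 3.0.0), is to switch clipping off in the coordinate system instead. The sketch below uses a cut-down, hypothetical subset of the data, and the label colours in it are illustrative rather than the result of any significance test.

```r
library(ggplot2)

# Hypothetical two-attribute subset, one attribute per facet.
time.data <- data.frame(
  Attribute = rep(c("Taste 1", "Use 1"), each = 2),
  Attribute.Category = rep(c("Taste/Quality", "Nutritional/Usage"), each = 2),
  Color = rep(c("#084594", "#6A51A3"), each = 2),
  value = c(75, 78, 43, 40),
  Date2 = rep(as.Date(c("2012-09-01", "2013-01-01")), times = 2),
  stringsAsFactors = FALSE
)

# One label per line, taken from the last date.
label.data <- time.data[time.data$Date2 == max(time.data$Date2), ]
label.data$label.color <- c("forestgreen", "red")  # illustrative only

p <- ggplot() +
  geom_line(data = time.data,
            aes(as.numeric(Date2), value, group = Attribute, colour = Color)) +
  geom_text(data = label.data,
            aes(x = Inf, y = value, label = paste("  ", Attribute),
                colour = label.color),
            hjust = 0) +
  facet_grid(Attribute.Category ~ .) +
  scale_colour_identity() +
  coord_cartesian(clip = "off") +   # draw outside the panel area
  theme(plot.margin = grid::unit(c(1, 5, 1, 1), "cm"))

print(p)
```

This keeps everything in a single ggplot object, so the plot can be printed or saved in the usual way without the extra ggplot_gtable and grid.draw steps.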

Brian Diggs
  • Brian, Thanks SO much for your help. Both with the specific example and with the more general lessons. Very insightful to see what a true minimal example would look like. I don't know how I'd missed scale_color_identity before. – tkvaran Jan 17 '13 at 23:49
  • Very nice answer (I tried to solve this but failed). In my view a minimal example would go even further, by using one of the built-in datasets, like `iris` or `mtcars`. – Andrie Jan 18 '13 at 06:59
  • @Andrie I also tend to prefer examples with built in datasets, but here I think they would miss two salient features: a pair of datasets, one with data and one with annotations; and non-overlapping colors specified explicitly in both datasets. – Brian Diggs Jan 18 '13 at 14:55