
I was trying to plot some predicted vs. actual data, something that resembles the following:

library(ggplot2)

# Some random data
x <- 1:10
y_pred <- runif(10, min = -10, max = 10)
y_obs <- y_pred + rnorm(10)
# Faking a CI
Lo.95 <- y_pred - 1.96
Hi.95 <- y_pred + 1.96
my_df <- data.frame(x, y_pred, y_obs, Lo.95, Hi.95)

ggplot(my_df, aes(x = x, y = y_pred)) +
  geom_line(aes(colour = "Forecasted Data"), size = 1.2) +
  geom_point(aes(x = x, y = y_obs, colour = "Actual Data"), size = 3) +
  geom_ribbon(aes(ymin=Lo.95, ymax=Hi.95, x=x, linetype = NA,  colour = "Confidence Interval"), alpha=0.2) +
  theme_grey() +
  scale_colour_manual(
    values = c("gray30", "blue", "red"),
    guide = guide_legend(override.aes = list(
      border=c(NA, NA, NA), 
      fill=c("gray30", "white", "white"),
      linetype = c("blank", "blank", "solid"),
      shape = c(NA, 19, NA)))) 

The plot looks like this:

[image: the resulting plot]

The only issue I have with this plot is the red border surrounding the legend item symbol for the line (i.e. the forecasted data). Is there any way I can remove it without breaking the rest of my plot?

neilfws
Kev W.
    I'd suggest that the underlying issue here is the way you are mapping data to colours. `aes(color = "Actual Data")`, for example, is not really the correct usage. Normally we use `aes()` to map variables (column names, unquoted). I would gather the data to create a column indicating whether y was observed or predicted, then use `geom_point()` with `aes()` to colour points by that column. – neilfws Mar 19 '18 at 23:24
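
A minimal sketch of what that comment describes, assuming the same simulated data as in the question. The names `long_df`, `series`, and `p_long` are hypothetical and chosen only for illustration, and the seed is added just to make the sketch reproducible. The observed and predicted values are stacked into one long data frame so that colour is mapped to a real column rather than to a quoted constant:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, not part of the original question
x <- 1:10
y_pred <- runif(10, min = -10, max = 10)
y_obs <- y_pred + rnorm(10)
my_df <- data.frame(x, y_pred, y_obs,
                    Lo.95 = y_pred - 1.96,
                    Hi.95 = y_pred + 1.96)

# Long format: one row per (x, series) pair
long_df <- rbind(
  data.frame(x = my_df$x, y = my_df$y_obs,  series = "Actual Data"),
  data.frame(x = my_df$x, y = my_df$y_pred, series = "Forecasted Data")
)

p_long <- ggplot(long_df, aes(x = x, y = y, colour = series)) +
  # The ribbon uses the wide data and constant fill, so it adds no legend entry
  geom_ribbon(data = my_df, aes(x = x, ymin = Lo.95, ymax = Hi.95),
              inherit.aes = FALSE, fill = "grey30", alpha = 0.2) +
  # Each layer draws only its own series
  geom_point(data = subset(long_df, series == "Actual Data"), size = 3) +
  geom_line(data = subset(long_df, series == "Forecasted Data")) +
  scale_colour_manual(NULL, values = c("Actual Data" = "blue",
                                       "Forecasted Data" = "red"))
p_long
```

Depending on the ggplot2 version, the legend keys may still show both a point and a line for each entry, in which case `override.aes` can be used to tidy them up.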

2 Answers


I think geom_ribbon was the problem. If we take its colour and fill out of aes(), the red border around the legend key disappears and everything looks fine:

library(ggplot2)

# Some random data
x <- 1:10
y_pred <- runif(10, min = -10, max = 10)
y_obs <- y_pred + rnorm(10)
# Faking a CI
Lo.95 <- y_pred - 1.96
Hi.95 <- y_pred + 1.96
my_df <- data.frame(x, y_pred, y_obs, Lo.95, Hi.95)

m1 <- ggplot(my_df, aes(x = x, y = y_pred)) +
  geom_point(aes(x = x, y = y_obs, colour = "Actual"), size = 3) +
  geom_line(aes(colour = "Forecasted"), size = 1.2) +
  geom_ribbon(aes(x = x, ymin = Lo.95, ymax = Hi.95), 
              fill = "grey30", alpha = 0.2) +
  scale_color_manual("Legend", 
                     values = c("blue", "red"),
                     labels = c("Actual", "Forecasted")) +
  guides( color = guide_legend(
    order = 1,
    override.aes = list(
                        color = c("blue", "red"),
                        fill  = c("white", "white"),
                        linetype = c("blank", "solid"),
                        shape = c(19, NA)))) +
  theme_bw() +
  # remove legend key border color & background
  theme(legend.key = element_rect(colour = NA, fill = NA),
    legend.box.background = element_blank())
m1

Because the Confidence Interval is no longer mapped inside aes(), it no longer gets a legend entry. One workaround is to add an invisible point layer and use an otherwise unused aesthetic to create the legend key manually. Here we can use size, with the point shape set to NA (credit to this answer):

m2 <- m1 +
  geom_point(aes(x = x, y = y_obs, size = "Confidence Interval", shape = NA)) +
  guides(size = guide_legend(NULL, 
                             order = 2,
                             override.aes = list(shape = 15, 
                                                 color = "lightgrey",
                                                 size = 6))) +
  # Move legends closer to each other
  theme(legend.title = element_blank(),
        legend.justification = "center",
        legend.spacing.y = unit(0.05, "cm"),
        legend.margin = margin(0, 0, 0, 0),
        legend.box.margin = margin(0, 0, 0, 0)) 
m2

Created on 2018-03-19 by the reprex package (v0.2.0).
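
As a possible alternative to the invisible-point workaround, here is a hedged sketch (not part of the original answer) that maps the ribbon's fill inside aes() and gives it its own scale with `scale_fill_manual()`, so the ribbon gets a separate legend while its colour stays unmapped. The object name `p_fill` and the fixed seed are assumptions for illustration:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
x <- 1:10
y_pred <- runif(10, min = -10, max = 10)
y_obs <- y_pred + rnorm(10)
my_df <- data.frame(x, y_pred, y_obs,
                    Lo.95 = y_pred - 1.96,
                    Hi.95 = y_pred + 1.96)

p_fill <- ggplot(my_df, aes(x = x)) +
  # fill is mapped, so the ribbon appears in a fill legend only
  geom_ribbon(aes(ymin = Lo.95, ymax = Hi.95, fill = "Confidence Interval"),
              alpha = 0.2) +
  geom_line(aes(y = y_pred, colour = "Forecasted")) +
  geom_point(aes(y = y_obs, colour = "Actual"), size = 3) +
  scale_fill_manual(NULL, values = c("Confidence Interval" = "grey30")) +
  scale_colour_manual(NULL, values = c("Actual" = "blue",
                                       "Forecasted" = "red"))
p_fill
```

The colour legend keys may still need the same `override.aes` treatment as in m1 to show a point for one entry and a line for the other.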

Tung

A better way to address this question is to set show.legend = FALSE in geom_ribbon(). This eliminates the need for the second step, which adds and merges a separate legend key for the confidence interval. Here is the code with slight modifications:

ggplot(my_df, aes(x = x, y = y_pred)) +
  geom_line(aes(colour = "Forecasted Data"), size = 1) +
  geom_point(aes(x = x, y = y_obs, colour = "Actual Data"), size = 1) +
  # show.legend = FALSE keeps the ribbon layer out of the legend keys
  geom_ribbon(aes(ymin = Lo.95, ymax = Hi.95, x = x, linetype = NA, colour = "Confidence Interval"),
              alpha = 0.2, show.legend = FALSE) +
  theme_grey() +
  scale_colour_manual(values = c("blue", "gray30", "red")) +
  guides(color = guide_legend(
    override.aes = list(linetype = c(1, 1, 0)),
    shape = c(1, NA, NA),
    reverse = TRUE))

My plot

Credit to https://stackoverflow.com/users/4282026/marblo for their answer to a similar question.

Kay Jay