I am combining several ggplot2 features to build a plot with geom_line, but the output looks wrong when they are used together as in the following example:

Necessary libraries:

library(ggplot2)
library(dplyr)
library(plotly)

Sample dataset:

df <- data.frame(a = sample(LETTERS, 10, replace = FALSE), 
                 b = rnorm(10, mean = 2, sd = 5),
                 c = rnorm(10, mean = 15, sd = 5),
                 d = sample(letters, 10, replace = FALSE))

The plot:

ggplot(df, aes(x = a)) + 
  geom_line(aes(y = b, group = 1, color = "line_one"), size = 2, alpha = 0.6) +
  geom_line(aes(y = c, group = 1, color = "line_two"), size = 2, alpha = 0.6) +
  scale_y_continuous(sec.axis = sec_axis(~. + 10)) +
  labs(x = "My x axis",
       y = "My y axis") +
  theme(axis.text = element_text(angle = 90, hjust = 0.4, vjust = -0.5)) +
  geom_hline(yintercept = df$b %>% quantile(.99), 
             size = 2, 
             color = "tomato", 
             linetype = "dashed",
             alpha = 0.6) +
  scale_color_manual(
    name = "", 
    values = c("line_one" = "red", "line_two" = "blue")
  ) + 
  theme_light()

The output produced does not show the color of the lines in the legend box:

[Image: the resulting plot, with no line colors shown in the legend box]

adl
  • I can't reproduce this (I get the corresponding colors in the legend). You might want to update your packages. – pogibas Jan 03 '19 at 14:38

1 Answer

You should always use long-format data with ggplot2. That way you don't need to add every single line as a separate layer, and you don't need to adjust the legend manually. Here is an example based on yours (I used tidyr::gather() to convert the data to long format):

df <- data.frame(a = sample(LETTERS, 10, replace = FALSE), 
                 line_one = rnorm(10, mean = 2, sd = 5),
                 line_two = rnorm(10, mean = 15, sd = 5),
                 d = sample(letters, 10, replace = FALSE))

df %>% 
  tidyr::gather(key = line, value = value, line_one, line_two) %>% 
  ggplot(aes(x = a, y = value, color = line, group = line)) +
  geom_line() +
  scale_color_manual(
    name = NULL, 
    values = c("line_one" = "red", "line_two" = "blue")
  ) +
  geom_hline(yintercept = df$line_one %>% quantile(.99), 
             size = 2, 
             color = "tomato", 
             linetype = "dashed",
             alpha = 0.6)
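
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same reshaping step with that function, assuming a recent tidyr is installed; the name df_long and the fixed seed are additions for illustration and are not part of the original example:

```r
library(ggplot2)
library(tidyr)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(a = sample(LETTERS, 10, replace = FALSE),
                 line_one = rnorm(10, mean = 2, sd = 5),
                 line_two = rnorm(10, mean = 15, sd = 5),
                 d = sample(letters, 10, replace = FALSE))

# Reshape the two value columns into key/value pairs (long format).
# df_long is a hypothetical name chosen for this sketch.
df_long <- pivot_longer(df,
                        cols = c(line_one, line_two),
                        names_to = "line",
                        values_to = "value")

# One geom_line layer draws both lines; the legend comes from the color mapping.
p <- ggplot(df_long, aes(x = a, y = value, color = line, group = line)) +
  geom_line() +
  scale_color_manual(
    name = NULL,
    values = c("line_one" = "red", "line_two" = "blue")
  )
p
```

The long table has one row per combination of a and line, so 10 rows in wide format become 20 rows in long format.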

Edit:

Another example showing the flexibility of long data with different layers:

library(ggplot2)
library(tidyr)
set.seed(1)
df_long <- data.frame(
  x = 1:10,
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
) %>% 
  gather(key = key, value = y, -x)

ggplot(mapping = aes(x = x, y = y, fill = key, color = key)) +
  geom_col(data = subset(df_long, key %in% "a")) +
  geom_line(data = subset(df_long, key %in% c("b", "c"))) +
  geom_point(data = subset(df_long, key %in% "d"))
Tino
  • Yes, but sometimes we might want to add a bar layer or a point layer instead of drawing everything as lines, as in this post: https://stackoverflow.com/questions/47023781/how-to-add-a-legend-for-two-geom-layers-in-one-ggplot2-plot – adl Jan 03 '19 at 14:35
  • Long still works best for that. Imagine you have 3 different values for each x (hence 3 colors) that you want to show as bars, plus a single line. You `gather` everything and use subsets with the desired keys in the 2 layers, instead of adding another layer for each line or bar you want to draw (special case with bars: in separate layers they are drawn on top of the previous ones, so they end up behind one another instead of stacked). – Tino Jan 03 '19 at 14:42
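
A minimal sketch of the setup described in the last comment, assuming three bar series and one line series. The column names bar_1, bar_2, bar_3 and line_1, the object names, and the random data are hypothetical and chosen only for illustration:

```r
library(ggplot2)
library(tidyr)

set.seed(1)
# Hypothetical wide data: three series meant for bars, one meant for a line.
wide <- data.frame(
  x = 1:10,
  bar_1 = runif(10),
  bar_2 = runif(10),
  bar_3 = runif(10),
  line_1 = runif(10, min = 1, max = 3)
)

# Gather every column except x into key/value pairs.
long <- gather(wide, key = key, value = y, -x)

bar_keys <- c("bar_1", "bar_2", "bar_3")

# Two layers in total: one for all bars, one for the line.
# Each layer receives only the subset of keys it should draw.
p <- ggplot(mapping = aes(x = x, y = y)) +
  geom_col(data = subset(long, key %in% bar_keys), aes(fill = key)) +
  geom_line(data = subset(long, key == "line_1"), aes(color = key))
p
```

Because all three bar series go through a single geom_col layer, they are stacked by its default position adjustment, which would not happen if each series were added as its own layer.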