I know this question has been asked before, and I've looked at many of the links, but none of them seem to be helping my case. I'm plotting a line graph for 4 lines of different colors. But I can't get the legend to appear.

I've read that I need to put the color attribute in the aes part of the graph. That hasn't been successful either.

I have a data frame with four columns and 1000 rows. Here's a small reproducible example of what my data looks like, and how I'd like to plot it.

library(ggplot2)

vec1 <- c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41)
vec2 <- c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6)
vec3 <- c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5)
vec4 <- c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3)

df <- data.frame(vec1, vec2, vec3, vec4)

df_plot <- ggplot() +
  geom_line(data = df, color = "black", aes(x = c(1:7), y = df[,1], color = "black")) +
  geom_line(data = df, color = "blue", aes(x = c(1:7), y = df[,2], color = "blue")) +
  geom_line(data = df, color = "green", aes(x = c(1:7), y = df[,3], color = "green")) +
  geom_line(data = df, color = "yellow", aes(x = c(1:7), y = df[,4], color = "yellow")) +
  xlab("x axis") +
  ylab("y axis") +
  ggtitle("A random plot") +
  theme(legend.title = element_text("Four lines"), legend.position = "right")

(Also, did SO change the process of indenting code? Before, I could just press Ctrl + K to indent the entire block of code. But I can't do that anymore. Ctrl+K puts the cursor in my URL box for some reason)

I'd like it to print the legend to the right of the graph.

neilfws
Zuhaib Ahmed

1 Answer

First: I see a lot of people here creating data frames by first creating individual vectors. I don't know where this practice originated but it isn't necessary:

df1 <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
                  vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
                  vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
                  vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))

Next: your data is in "wide" form. ggplot2 works better with "long" form: one column for variables, another for their values. You can get to that using tidyr::gather. While we're at it, we can use dplyr::mutate to add the x variable:

library(dplyr)
library(tidyr)
library(ggplot2)

df1 %>% 
  gather(Var, Val) %>% 
  mutate(x = rep(1:7, 4))
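
As a side note, in tidyr 1.0.0 and later, gather is superseded by pivot_longer. Here is a rough sketch of the same reshaping with it, assuming a recent tidyr is installed. The name df_long is only illustrative. The x index is added before reshaping because pivot_longer orders the rows differently from gather, so rep(1:7, 4) would no longer line up afterwards:

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
                  vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
                  vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
                  vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))

# Add the x index while the data is still wide (one row per x value),
# then stack every column except x into Var/Val pairs.
df_long <- df1 %>%
  mutate(x = row_number()) %>%
  pivot_longer(cols = -x, names_to = "Var", values_to = "Val")

head(df_long)
```

The result has one row per (x, Var) combination, which is the shape the plotting code expects.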

Now we can plot. With the data in this form, there is no need to use a separate geom for each variable and aes() automatically takes care of colors and legends. You can specify custom colors using scale_color_manual. I don't know that yellow or green are great choices, but here it is:

df1 %>% 
  gather(Var, Val) %>% 
  mutate(x = rep(1:7, 4)) %>% 
  ggplot(aes(x, Val)) + 
    geom_line(aes(color = Var)) + 
    scale_color_manual(values = c("black", "blue", "green", "yellow"))
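
If you also want the legend title, axis labels and plot title from the question, something like the following should work. This is a sketch rather than the only way to do it: the legend title is set through the name argument of scale_color_manual (labs(color = "Four lines") would be an alternative), and line_colors is just an illustrative name for a named vector, so each color is tied to a variable explicitly rather than by position:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df1 <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
                  vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
                  vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
                  vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))

# Named vector: names match the values in the Var column.
line_colors <- c(vec1 = "black", vec2 = "blue", vec3 = "green", vec4 = "yellow")

p <- df1 %>%
  gather(Var, Val) %>%
  mutate(x = rep(1:7, 4)) %>%
  ggplot(aes(x, Val)) +
    geom_line(aes(color = Var)) +
    # name sets the legend title; values sets the colors per variable
    scale_color_manual(name = "Four lines", values = line_colors) +
    labs(x = "x axis", y = "y axis", title = "A random plot") +
    # "right" is already the default position, shown here to be explicit
    theme(legend.position = "right")

p
```

Note that theme(legend.title = element_text(...)) controls how the title is styled (font, size and so on), not what it says, which is why the title text is set on the scale instead.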


The key is having your data in the correct format, and understanding how that allows aes to map variables to chart properties.
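
To illustrate why the original attempt produced no legend, here is a small sketch using only the first two columns of the example data (df_small, p_mapped and p_fixed are illustrative names). When color is set to a constant outside aes(), that constant overrides any color mapping in the same layer, so ggplot2 has no color scale to draw a legend for:

```r
library(ggplot2)

df_small <- data.frame(
  x   = rep(1:7, 2),
  Var = rep(c("vec1", "vec2"), each = 7),
  Val = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41,
          0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6)
)

# Mapped: color inside aes() refers to a column, so a scale and legend are built.
p_mapped <- ggplot(df_small, aes(x, Val)) +
  geom_line(aes(color = Var))

# Fixed: the constant color outside aes() wins over the mapping, so no legend.
# group = Var keeps the two lines separate even though color no longer does.
p_fixed <- ggplot(df_small, aes(x, Val, group = Var)) +
  geom_line(aes(color = Var), color = "black")

p_mapped
p_fixed
```

Printing both plots side by side should show a legend on the first and none on the second.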

neilfws
  • Thank You. I'm not very familiar with the %>% notation that you used, but I'll look into it. – Zuhaib Ahmed Feb 05 '19 at 21:37
  • That's called a pipe, basically it takes the result of the operation on the left and passes it to the one on the right. There are lots of tutorials on it _e.g._ [this one](https://www.datacamp.com/community/tutorials/pipe-r-tutorial). – neilfws Feb 05 '19 at 21:40
  • Ah OK. Thank you again, for your help! – Zuhaib Ahmed Feb 05 '19 at 21:42