
I am working with the R programming language.

I have the following data:

# "d"
d <- structure(list(iteration = 1:20, State_1 = c(0.833333333333333, 
0.805555555555556, 0.782407407407408, 0.760030864197531, 0.738297325102881, 
0.717185356652949, 0.696677097622314, 0.676755282159732, 0.657403140559112, 
0.638604382721342, 0.620343184372478, 0.602604173741329, 0.585372418619872, 
0.568633413793723, 0.55237306883206, 0.536577696226679, 0.521233999870233, 
0.506329063863936, 0.491850341645319, 0.477785645426876), State_2 = c(0.166666666666667, 
0.166666666666667, 0.162037037037037, 0.157407407407407, 0.152906378600823, 
0.148533950617284, 0.144286551211706, 0.140160608139003, 0.136152648383123, 
0.132259298157039, 0.128477280146397, 0.124803410753146, 0.121234597415746, 
0.117767836005936, 0.114400208299943, 0.111128879522001, 0.107951095958113, 
0.104864182638058, 0.101865541083666, 0.0989526471214974), State_3 = c(0, 
0.0277777777777778, 0.0555555555555555, 0.0825617283950617, 0.108796296296296, 
0.134280692729767, 0.159036351165981, 0.183084109701265, 0.206444211057766, 
0.229136319121619, 0.251179535481126, 0.272592415505525, 0.293392983964383, 
0.313598750200341, 0.333226722867997, 0.35229342425132, 0.370814904171654, 
0.388806753498006, 0.406284117271016, 0.423261707451627)), row.names = c(NA, 
20L), class = "data.frame")

I am trying to plot the data as follows:

library(ggplot2)

p = ggplot() + 
    geom_line(data = d, aes(x = iteration, y = State_1), color = "blue") + 
    geom_line(data = d, aes(x = iteration, y = State_2), color = "red") + 
    geom_line(data = d, aes(x = iteration, y = State_3), color = "green") + 
    xlab('Number of Iterations') + 
    ylab('Probability of Being in a Certain State') + 
    ggtitle("Plot") + 
    theme(legend.position = "bottom")


For some reason, the legend does not appear even though I set legend.position in the ggplot call. As far as I understand, ggplot2 should also add a legend automatically (https://r-graph-gallery.com/239-custom-layout-legend-ggplot2.html).

Does anyone know how to fix this?

Thanks!

Note: For some reason, the "melt" option seems to automatically fix this:

library(reshape2)
mdf <- reshape2::melt(d, id.var = "iteration")
ggplot(mdf, aes(x = iteration, y = value, colour = variable)) + 
    geom_point() + 
    geom_line()
stats_noob
  • In your example, you would have to move the color argument into the aesthetics mapping, i.e. ```geom_line(data = d, aes(x = iteration, y = State_1, color = "State_1"))``` or reshape it to long format. – ansgar Jun 10 '22 at 19:30

2 Answers


If you want color to be represented in the legend, you need to map the series names to the color aesthetic. You could do this with three separate calls to geom_line, using color = "Series name" inside aes, but it is more compact and elegant to pivot your data to long format. This puts all your x values in one column, your y values in another column, and creates a new column which labels each row with the series it came from. The modern way to do this is with pivot_longer from tidyr, which is part of the tidyverse.

library(tidyverse)

ggplot(pivot_longer(d, -1), aes(iteration, value, colour = name)) +
  geom_line()
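
To see what the reshape actually does, it can help to look at the long data before plotting it. The sketch below is only an illustration: `d_small` is a hypothetical three-row stand-in for `d` with values rounded from the question, and it assumes tidyr is installed.

```r
library(tidyr)

# Hypothetical three-row stand-in for `d` (values rounded from the question)
d_small <- data.frame(
  iteration = 1:3,
  State_1   = c(0.833, 0.806, 0.782),
  State_2   = c(0.167, 0.167, 0.162),
  State_3   = c(0.000, 0.028, 0.056)
)

# Stack every column except the first into name/value pairs
long <- pivot_longer(d_small, -1)

names(long)  # "iteration" "name" "value"
nrow(long)   # 9: three iterations times three states
```

The `name` column is what gets mapped to `colour` in the plot, which is why a single `geom_line()` call is enough to draw all three series with a legend.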


To select the colors you want, you can use scale_color_manual. You can also change theme elements to suit your taste:

ggplot(pivot_longer(d, -1) %>% mutate(name = sub("_", " ", name)), 
       aes(iteration, value, colour = name)) +
  # linewidth replaces size for lines as of ggplot2 3.4.0
  geom_line(linewidth = 2, alpha = 0.5) +
  scale_color_manual(values = c("orange3", "green4", "purple4"), name = NULL) +
  theme_minimal(base_size = 20) +
  labs(y = NULL)


Allan Cameron

This is more or less the same as what @Allan Cameron already provided.

I want to emphasize the rule of thumb: whatever is mapped inside aes() gets a legend.

This helped me a lot to handle legend issues.
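
A minimal sketch of that rule, using a hypothetical one-series data frame `d_small` (values rounded from the question) rather than the full `d`. The two plots differ only in where the colour is specified:

```r
library(ggplot2)

# Hypothetical stand-in for one column of `d`
d_small <- data.frame(
  iteration = 1:3,
  State_1   = c(0.833, 0.806, 0.782)
)

# Colour set outside aes(): a fixed property of the layer, so no legend
p_const <- ggplot(d_small, aes(iteration, State_1)) +
  geom_line(color = "blue")

# Colour set inside aes(): the string is treated as data (a label),
# so ggplot2 builds a colour scale and draws a legend for it
p_mapped <- ggplot(d_small, aes(iteration, State_1, color = "State_1")) +
  geom_line()
```

Printing `p_const` should give a blue line with no legend, while `p_mapped` should show a legend entry labelled "State_1" in a colour chosen by the default palette.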

Additionally, although it is not the recommended approach, here is a solution that keeps your data in wide format. In some cases it may be necessary to keep the wide format:

ggplot() + 
   geom_line(data = d, aes(x = iteration, y = State_1, color = "blue")) + 
   geom_line(data = d, aes(x = iteration, y = State_2, color = "red")) + 
   geom_line(data = d, aes(x = iteration, y = State_3, color = "green")) + 
   xlab('Number of Iterations') + 
   ylab('Probability of Being in a Certain State') 
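
One thing to be aware of with the code above: inside aes(), "blue", "red" and "green" act as legend labels, not as colours, so the lines are coloured by the default palette and will not necessarily match their labels. A possible way around this, sketched here on a hypothetical three-row stand-in `d_small` (values rounded from the question), is to use the series names as labels and assign the colours with scale_color_manual:

```r
library(ggplot2)

# Hypothetical three-row stand-in for `d`
d_small <- data.frame(
  iteration = 1:3,
  State_1   = c(0.833, 0.806, 0.782),
  State_2   = c(0.167, 0.167, 0.162),
  State_3   = c(0.000, 0.028, 0.056)
)

p <- ggplot(d_small, aes(x = iteration)) +
  # The strings inside aes() become legend labels
  geom_line(aes(y = State_1, color = "State 1")) +
  geom_line(aes(y = State_2, color = "State 2")) +
  geom_line(aes(y = State_3, color = "State 3")) +
  # The named vector ties each label to the colour actually drawn
  scale_color_manual(
    name   = NULL,
    values = c("State 1" = "blue", "State 2" = "red", "State 3" = "green")
  ) +
  xlab("Number of Iterations") +
  ylab("Probability of Being in a Certain State") +
  theme(legend.position = "bottom")

print(p)
```

Swapping `d_small` for the original `d` should give the plot from the question, with the legend at the bottom.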


TarJae