
I have a data frame that I made. The row names are the years. Let's say you have this data frame, named fruits:

     | apples | bananas | apples
2012 |     23 |      32 |     23
2013 |     26 |      29 |     23
2014 |     32 |      43 |     25
2015 |     23 |      43 |     26
2016 |     21 |      46 |     22

How do you make a line graph out of this? I assume the best way is to use ggplot. I used the following code:

years = seq(from = 2012, to = 2016)

ggplot(data = fruits, mapping = aes(x = years)) +
  geom_line(aes(y = apples), color = "green")+
  geom_line(aes(y = bananas), color = "yellow")+
  geom_line(aes(y = apples), color = "indianred")

This gives me a plot without a legend. When I search on the internet, I only find examples of graphs that get a legend right away. Is something wrong with my code? And is there a faster way, or a way that isn't as hardcoded as mine, to have multiple lines in a graph?

stefan
Ahek
    If you want a legend, you have to map on aesthetics, i.e. move `color = ...` into `aes()`. To set the colors, use `scale_color_manual(values = ...)`. See also e.g. [Add legend to ggplot2 line plot](https://stackoverflow.com/questions/10349206/add-legend-to-ggplot2-line-plot) – stefan Nov 08 '20 at 10:53
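
A minimal sketch of what this comment suggests, kept close to the code in the question. The data frame is a hypothetical reconstruction: it assumes the years are stored in a column named years and that the second apples column is named apples.1. The labels used inside `aes()` are arbitrary strings chosen for illustration.

```r
library(ggplot2)

# Hypothetical reconstruction of the question's data (assumption:
# years is a regular column and the duplicate column is apples.1)
fruits <- data.frame(
  years    = 2012:2016,
  apples   = c(23, 26, 32, 23, 21),
  bananas  = c(32, 29, 43, 43, 46),
  apples.1 = c(23, 23, 25, 26, 22)
)

p <- ggplot(fruits, aes(x = years)) +
  # colour inside aes() is mapped to a label, which produces a legend entry
  geom_line(aes(y = apples, colour = "apples")) +
  geom_line(aes(y = bananas, colour = "bananas")) +
  geom_line(aes(y = apples.1, colour = "apples.1")) +
  # the actual colours are assigned per label here
  scale_colour_manual(
    name   = "fruit",
    values = c(apples = "green", bananas = "yellow", apples.1 = "indianred")
  )

p
```

This still needs one `geom_line()` call per column, so it fixes the missing legend but not the hardcoding.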

1 Answer


This type of problem generally has to do with reshaping the data. ggplot2 works best with data in long format, and the data here is in wide format. See this post on how to reshape the data from wide to long format.
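
To make the difference concrete, here is a small sketch of the two shapes. The data frame is a hypothetical reconstruction of the posted data, assuming the years are row names and the second apples column is already named apples.1; fruits_long is a name introduced only for this illustration.

```r
library(tidyr)

# Hypothetical reconstruction of the wide data from the question
fruits <- data.frame(
  apples   = c(23, 26, 32, 23, 21),
  bananas  = c(32, 29, 43, 43, 46),
  apples.1 = c(23, 23, 25, 26, 22),
  row.names = as.character(2012:2016)
)

# Wide format: one row per year, one column per fruit
dim(fruits)
#[1] 5 3

# Long format: one row per year/fruit combination
fruits_long <- pivot_longer(
  tibble::rownames_to_column(fruits, var = "years"),
  cols = -years,
  names_to = "fruit",
  values_to = "value"
)
dim(fruits_long)
#[1] 15  3
```

In the long shape, a single fruit column can be mapped to colour, so one `geom_line()` call draws every line and the legend comes for free.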

Define a colors vector.

clrs <- c("green", "yellow", "indianred")
clrs <- setNames(clrs, names(fruits))

Since the posted data set has two columns named "apples" and R does not keep duplicate column names by default, the second column with that name becomes "apples.1". The following sub() call removes the trailing ".1" from that name, so it can be used as a legend label.

names(fruits)
#[1] "apples"   "bananas"  "apples.1"

sub("\\.\\d$", "", names(clrs))
#[1] "apples"  "bananas" "apples" 
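
For anyone who wants to reproduce the renaming, a minimal sketch follows. It assumes the data was built with `data.frame()`, whose default `check.names = TRUE` makes duplicated names unique; how the original data frame was actually created is not stated in the question.

```r
# Assumption: the data is built with data.frame(); the default
# check.names = TRUE turns the second "apples" into "apples.1"
fruits <- data.frame(
  apples  = c(23, 26, 32, 23, 21),
  bananas = c(32, 29, 43, 43, 46),
  apples  = c(23, 23, 25, 26, 22),
  row.names = as.character(2012:2016)
)
names(fruits)
#[1] "apples"   "bananas"  "apples.1"

# Named colour vector keyed by the (unique) column names
clrs <- setNames(c("green", "yellow", "indianred"), names(fruits))

# Legend labels with the numeric suffix stripped
lbls <- sub("\\.\\d$", "", names(clrs))
lbls
#[1] "apples"  "bananas" "apples"
```

The names of clrs stay unique so each line keeps its own colour, while the stripped labels are only used for display.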

And now plot it.

library(dplyr)
library(tidyr)
library(ggplot2)

fruits %>%
  tibble::rownames_to_column(var = "years") %>%
  pivot_longer(
    cols = -years,
    names_to = "fruit",
    values_to = "value"
  ) %>%
  mutate(years = as.integer(years)) %>%
  ggplot(aes(years, value, colour = fruit)) +
  geom_line() +
  scale_colour_manual(breaks = names(clrs), values = clrs, 
                      labels = sub("\\.\\d$", "", names(clrs))) +
  theme_bw()


Rui Barradas