For my data analysis course, I need to study some financial results from a fictional company.

I'm currently working with a very simple data frame:

gmv_month <- structure(
  list(
    month = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
    `2021` = c(0, 0, 482.44061, 476.1093, 492.94347, 484.08856, 482.8354, 482.28479, 507.24068, 320.79874, 516.16773, 525.91728),
    `2022` = c(525.33899, 535.5715, 514.60641, 492.99894, 517.1326, 496.01612, 510.78312, 506.46727, 494.11453, 507.91777, 496.66494, 510.2195),
    `2023` = c(517.54055, 456.67976, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  ),
  row.names = c(NA, -12L),
  class = "data.frame"
)

Which gives something like this (first six rows, rounded):

#     month 2021 2022 2023
# 1     1      0  525  518
# 2     2      0  536  457
# 3     3    482  515    0
# 4     4    476  493    0
# 5     5    493  517    0
# 6     6    484  496    0

I'm trying to draw a simple ggplot with month on the x axis and the values on the y axis. I have 3 years, so I need 3 colors:

ggplot(gmv_month, aes(x = month)) +
  geom_line(aes(y = `2021`, group = 1, color = "red")) +
  geom_line(aes(y = `2022`, group = 1, color = "blue")) +
  geom_line(aes(y = `2023`, group = 1, color = "green")) +
  geom_point(aes(y = `2021`, group = 1, color = "red")) +
  geom_point(aes(y = `2022`, group = 1, color = "blue")) +
  geom_point(aes(y = `2023`, group = 1, color = "green")) +
  scale_y_continuous(name = "GMV K€",
                     limits = c(0, 600),
                     labels = label_number()
  ) +
  scale_x_continuous("Month", breaks = seq(0, 12, 1)) +
  labs(title = "Chiffre d'affaire par mois") +
  scale_color_manual(labels = c("2021", "2022", "2023"),
                     values = c("red", "blue", "green")
  )

Sorry, I can't post images because I'm new here.

In the result:

2021, which is supposed to be red, is drawn in green and labelled 2023 in the legend.

2022, which is supposed to be blue, is drawn in red and labelled 2021 in the legend.

2023, which is supposed to be green, is drawn in blue and labelled 2022 in the legend.

What is wrong with my code?

Thank you very much

tjebo
  • Try to put the `color = ...` statements outside the `aes()` function; e.g. ``geom_line(aes(y = `2021`, group = 1), color = "red")`` – teunbrand Mar 09 '21 at 11:35
  • [Reshape from wide-to-long](https://stackoverflow.com/q/2185252/680068) before plotting, will make your code so much cleaner. – zx8754 Mar 09 '21 at 11:38

2 Answers

As already said in the comments, data in long format is your friend.

library(tidyverse)

# gmv_month is the data frame from the question
gmv_month %>%
  pivot_longer(-1) %>% # this line makes the data long: columns `name` (year) and `value`
  ggplot(aes(month, value, color = name)) +
  geom_point() +
  geom_line() +
  scale_color_manual(labels = c("2021", "2022", "2023"),
                     values = c("red", "blue", "green"))
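
As an optional refinement, here is a minimal sketch in which the entries of `values` are named, so each colour is tied to a year explicitly instead of by position. The small `gmv_month` stand-in (first three months, rounded as in the question's printout) and the names `year_colours`, `gmv_long`, `year` and `gmv` are assumptions made for illustration only.

```r
library(ggplot2)
library(tidyr)

# Assumption: a small stand-in for the question's data (first three months, rounded)
gmv_month <- data.frame(
  month  = c(1, 2, 3),
  `2021` = c(0, 0, 482),
  `2022` = c(525, 536, 515),
  `2023` = c(518, 457, 0),
  check.names = FALSE  # keep the year column names as they are
)

# Named vector: the name is the year, the value is the colour to draw it in
year_colours <- c("2021" = "red", "2022" = "blue", "2023" = "green")

# Wide to long: one row per month and year
gmv_long <- pivot_longer(gmv_month, -month, names_to = "year", values_to = "gmv")

p <- ggplot(gmv_long, aes(month, gmv, color = year)) +
  geom_line() +
  geom_point() +
  scale_color_manual(values = year_colours)
p
```

With named values, the legend labels come from the data itself, so no separate `labels` argument is needed and the colours do not depend on how the levels happen to sort.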
Roman
This seems to yield the result you were hoping for:

  • 2021 is red
  • 2022 is blue
  • 2023 is green
library(tidyverse)

gmv_month <- structure(
  list(
    month = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), 
    `2021` = c(0, 0, 482.44061, 476.1093, 492.94347, 484.08856, 482.8354, 482.28479, 507.24068, 320.79874, 516.16773, 525.91728), 
    `2022` = c(525.33899, 535.5715, 514.60641, 492.99894, 517.1326, 496.01612, 510.78312, 506.46727, 494.11453, 507.91777, 496.66494, 510.2195),
    `2023` = c(517.54055, 456.67976, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  ), 
  row.names = c(NA, -12L), 
  class = "data.frame"
)

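ggplot(gmv_month, aes(x = month)) +
  # Constant colours go outside aes(), so they are used literally
  geom_line(aes(y = `2021`, group = 1), color = "red") +
  geom_line(aes(y = `2022`, group = 1), color = "blue") +
  geom_line(aes(y = `2023`, group = 1), color = "green") +
  geom_point(aes(y = `2021`, group = 1), color = "red") +
  geom_point(aes(y = `2022`, group = 1), color = "blue") +
  geom_point(aes(y = `2023`, group = 1), color = "green") +
  # No colour aesthetic is mapped any more, so this scale has nothing to act on and no legend is drawn
  scale_color_manual(labels = c("2021", "2022", "2023"),
                     values = c("red", "blue", "green")
  )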

I am afraid I cannot post images just yet, but it does look correct to me.
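
As for why the original plot scrambled the colours: inside `aes()`, `color = "red"` is treated as a data value (a category that happens to be called "red"), not as a colour. The discrete scale then sorts those categories and hands out the unnamed `values` and `labels` by position. The base-R sketch below only imitates that matching to show the outcome; it is an illustration with made-up variable names, not ggplot2's internal code.

```r
# Strings used inside aes() in the question, in the order the layers were added
aes_strings <- c("red", "blue", "green")

# A discrete scale orders its categories alphabetically: "blue", "green", "red"
scale_levels <- sort(unique(aes_strings))

# Unnamed values and labels from scale_color_manual() are matched by position
mapping <- data.frame(
  aes_string   = scale_levels,
  drawn_colour = c("red", "blue", "green"),
  legend_label = c("2021", "2022", "2023")
)
print(mapping)
```

The 2021 series was mapped to the string "red", which sorts last, so it receives the third value (green) and the third label (2023). That matches the behaviour described in the question.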

Daniel D