I'm trying to plot multiple variables in the same ggplot, and I want to change the color of the line based on the year. This is what I have so far, but I want the plot to be easier to understand:

ggplot(data = larceny_cases_districts, aes(x = Year, y = Cases))+
    geom_line(color = "#d62f53", size = 2 )+
    geom_point(shape = 21, color = "black", fill = "black", size = 6)+
    geom_text(aes(label = Cases), hjust=-1,vjust=0.1, color="#ff0000")+
    theme_ipsum()+
    ggtitle("Larceny Crimes in 2015-2018")

This is how my data frame is set up right now:

  larceny_cases_districts
  Year Cases
1 2015  2895
2 2016  4561
3 2017  4450
4 2018  2982

But I want it to look something like this, so that I can use colour = var_value to draw multiple lines with geom_line(). I can't find a way to make this work:

  larceny_cases_districts
  District  2015  2016  2017  2018
1 A1       value value value value
2 D4       value value value value
3 B2       value value value value

My goal is to make a plot that has 3 lines, one per district, each showing that district's value for each year.

Output of dput(head(larceny_cases_districts, 20)):

structure(list(District = c("A1", "D4", "B2"), `2015` = c(10L, 
    6L, 1L), `2016` = c(13L, 8L, 8L), `2017` = c(10L, 2L, 6L), `2018` = c(13L, 
    2L, 3L)), row.names = c("1", "2", "3"), class = "data.frame")

2 Answers

This type of problem generally comes down to reshaping the data. For ggplot2 the data should be in long format (one row per district and year), but here it is in wide format (one column per year). See this post on how to reshape the data from wide to long format.
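
As a minimal sketch of what the reshape produces, the snippet below runs only the pivot_longer() step on the data from the question and inspects the result. The name larceny_long is introduced here purely for illustration.

```r
library(tidyr)

# Wide data, copied from the dput() output in the question
larceny_cases_districts <-
  structure(list(District = c("A1", "D4", "B2"),
                 `2015` = c(10L, 6L, 1L),
                 `2016` = c(13L, 8L, 8L),
                 `2017` = c(10L, 2L, 6L),
                 `2018` = c(13L, 2L, 3L)),
            row.names = c("1", "2", "3"), class = "data.frame")

# Stack the four year columns into a Year/Cases pair of columns
larceny_long <- pivot_longer(
  larceny_cases_districts,
  cols = starts_with("20"),
  names_to = "Year",
  values_to = "Cases"
)

dim(larceny_long)    # 12 rows (3 districts x 4 years), 3 columns
names(larceny_long)  # "District" "Year" "Cases"
str(larceny_long$Year)  # character, since the values come from column names
```

Because Year is built from column names, it comes out as a character column, so it needs converting to a number before it is used on a continuous x axis.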

If the original data is already in long format, there is no need for the dplyr/tidyr pipe that precedes the plotting instructions below.

library(dplyr)
library(tidyr)
library(ggplot2)
library(hrbrthemes)

larceny_cases_districts %>%
  pivot_longer(
    cols = starts_with('20'),
    names_to = 'Year',
    values_to = 'Cases'
  ) %>%
  mutate(Year = as.integer(Year)) %>%
  ggplot(aes(Year, Cases, fill = District)) +
  geom_line() +
  geom_point(shape = 21, color = "black", size = 6) +
  geom_text(aes(label = Cases), hjust = -1, vjust = 0.1, color = "#ff0000") +
  ggtitle("Larceny Crimes in 2015-2018") +
  theme_ipsum()
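
In the code above, District is mapped to fill only, and geom_line() does not use fill for the line colour. If the goal is one coloured line per district, one possible variant is to map colour as well. This is a sketch rather than the code that produced the original plot: it leaves out theme_ipsum() so that only tidyr and ggplot2 are needed, and the names larceny_long and p are introduced here for illustration.

```r
library(tidyr)
library(ggplot2)

# Wide data, copied from the dput() output in the question
larceny_cases_districts <-
  structure(list(District = c("A1", "D4", "B2"),
                 `2015` = c(10L, 6L, 1L),
                 `2016` = c(13L, 8L, 8L),
                 `2017` = c(10L, 2L, 6L),
                 `2018` = c(13L, 2L, 3L)),
            row.names = c("1", "2", "3"), class = "data.frame")

# Reshape to long format and make Year numeric for the x axis
larceny_long <- pivot_longer(
  larceny_cases_districts,
  cols = starts_with("20"),
  names_to = "Year",
  values_to = "Cases"
)
larceny_long$Year <- as.integer(larceny_long$Year)

# colour = District colours the lines; fill = District fills the points.
# The fixed colours set inside geom_point() and geom_text() override the
# mapped colour for those two layers only.
p <- ggplot(larceny_long,
            aes(Year, Cases, colour = District, fill = District)) +
  geom_line() +
  geom_point(shape = 21, colour = "black", size = 6) +
  geom_text(aes(label = Cases), hjust = -1, vjust = 0.1, colour = "#ff0000") +
  ggtitle("Larceny Crimes in 2015-2018")

print(p)
```

Mapping both aesthetics to the same variable lets ggplot2 merge them into a single District legend.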

Data

larceny_cases_districts <- 
structure(list(District = c("A1", "D4", "B2"), `2015` = c(10L, 
    6L, 1L), `2016` = c(13L, 8L, 8L), `2017` = c(10L, 2L, 6L), 
    `2018` = c(13L, 2L, 3L)), row.names = c("1", "2", "3"), 
    class = "data.frame")
Rui Barradas

Maybe you are looking for this:

#Code
ggplot(data = larceny_cases_districts, aes(x = Year,
                                           y = Cases,
                                           color=factor(Year),
                                           group=1,
                                           fill=factor(Year)))+
  geom_line(size = 2 )+
  geom_point(shape = 21, color = "black", size = 6)+
  geom_text(aes(label = Cases), hjust=-1,vjust=0.1, color="#ff0000")+
  theme_ipsum()+
  ggtitle("Larceny Crimes in 2015-2018")

Some data used:

#Data
larceny_cases_districts <- structure(list(Year = 2015:2018, Cases = c(2895L, 4561L, 4450L, 
2982L)), class = "data.frame", row.names = c("1", "2", "3", "4"
))
Duck