
I'm very new to R and ggplot and totally self-taught. I'm trying to recreate a chart that appears on a UK government Coronavirus website.

I can't seem to get ggplot to create a legend at all. I want the legend to show 3 things: (grey) Most recent 5 days (incomplete), (steelblue) Number of cases, and (blue line) Cases (7 day average).

I'm confident that I can make the axes look nice, format the dates, etc.; I've just been struggling with the legend.

Thanks for looking.

# Libraries
library(tidyverse)
library(zoo) # provides rollapply()

#Data
England_Rates <- read.csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=nation&areaCode=E92000001&metric=newCasesBySpecimenDate&format=csv")


England_Rates$AveragenewCasesBySpecimenDate <- rollapply(England_Rates$newCasesBySpecimenDate, 7, mean, align = "right", fill=NA)
England_Rates$data_colour <- "steelblue"
England_Rates$data_colour[1] <- NA
England_Rates$data_colour[2] <- "grey"
England_Rates$data_colour[3] <- "grey"
England_Rates$data_colour[4] <- "grey"
England_Rates$data_colour[5] <- "grey"
England_Rates$data_colour[6] <- "grey"

ggplot() +
  geom_col(aes(x = England_Rates$date, y = England_Rates$newCasesBySpecimenDate),colour=England_Rates$data_colour) +
  geom_line(aes(x = England_Rates$date, y = England_Rates$AveragenewCasesBySpecimenDate), size=1.2, colour="blue", group=1)

https://coronavirus.data.gov.uk/details/cases

WORKING CODE

library(tidyverse)
library(zoo) # provides rollapply()

#Data
England_Rates <- read.csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=nation&areaCode=E92000001&metric=newCasesBySpecimenDate&format=csv")


England_Rates$AveragenewCasesBySpecimenDate <- rollapply(England_Rates$newCasesBySpecimenDate, 7, mean, align = "right", fill=NA)
England_Rates$data_colour <- "Number of cases"
England_Rates$data_colour[1] <- NA
England_Rates$data_colour[2] <- "Most recent 5 days (incomplete)"
England_Rates$data_colour[3] <- "Most recent 5 days (incomplete)"
England_Rates$data_colour[4] <- "Most recent 5 days (incomplete)"
England_Rates$data_colour[5] <- "Most recent 5 days (incomplete)"
England_Rates$data_colour[6] <- "Most recent 5 days (incomplete)"

# data.frame() converts the quoted name below to Cases..7.day.Average.
data <- data.frame (Date = as.Date(England_Rates$date),
                    Cases = England_Rates$newCasesBySpecimenDate,
                    "Cases (7-day Average)" = England_Rates$AveragenewCasesBySpecimenDate,
                    BlueorGrey = England_Rates$data_colour)

group.colors <- c("Number of cases" = "#5694ca", "Most recent 5 days (incomplete)"="dark grey","Cases 7 day (Average)"="dark blue" )
ggplot(data,
       aes(x=Date, y=Cases, fill=BlueorGrey))+
  theme(
    text = element_text(color = "black"),
    axis.text=element_text(colour="black"),
    panel.background = element_rect(fill = "white",
                                    colour = "white",
                                    size = 0.5, linetype = "solid"),
    panel.grid.major = element_line(size = 0.5, linetype = 'twodash',
                                    colour = "light grey"), 
    panel.grid.minor = element_line(size = 0.25, linetype = 'twodash',
                                    colour = "light grey"),
    legend.title = element_blank(),
    legend.position = "bottom",
    legend.box = "vertical",
    axis.title=element_blank())+
  geom_bar(stat = "identity", width=1) + 
  scale_fill_manual(values=group.colors) +
  geom_line(aes(y=Cases..7.day.Average., fill="Cases 7 day (Average)"), colour="dark blue", size=0.8)
Ian
    General comment; using `$`-subsetting in the `aes()` function may cause some weirdness every now and then. If you use `ggplot(data = England_Rates)`, then you can refer to the columns of that data inside the `aes()` function without needing to quote them. Also, you get a legend by putting the `colour` aesthetic inside the `aes()`. – teunbrand Apr 07 '21 at 17:19
  • Does this answer your question? [Add legend to ggplot2 line plot](https://stackoverflow.com/questions/10349206/add-legend-to-ggplot2-line-plot) – tjebo Apr 07 '21 at 17:46

1 Answer

You don't need to manually assign colours to the data beforehand; mapping an aesthetic inside aes() generates a scale, and with it a legend, for you. You can then assign a colour palette at the scale level. In the example below, we simply put the text that we want to appear as the legend label into the layer's aes().

# Libraries
library(tidyverse)
library(zoo)

#Data
England_Rates <- read.csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=nation&areaCode=E92000001&metric=newCasesBySpecimenDate&format=csv")


England_Rates$AveragenewCasesBySpecimenDate <- rollapply(
  England_Rates$newCasesBySpecimenDate, 7, mean, align = "right", fill=NA
)
# as.Date() makes the axis nicer
# We declare the x-variable globally in the main ggplot call,
# so you don't need to repeat it in the geom-layers.
ggplot(England_Rates, aes(x = as.Date(date))) +
  geom_col(aes(y = newCasesBySpecimenDate)) +
  geom_line(aes(y = AveragenewCasesBySpecimenDate, 
                colour = "7 Day Rolling Average"), # Just fill in text or choose a column from data
            size=1.2, group=1) +
  scale_colour_manual(
    values = "blue" # Choose as many colours as you have legend items
  )
#> Warning: Removed 6 row(s) containing missing values (geom_path).

Created on 2021-04-07 by the reprex package (v1.0.0)

A similar approach works for geom_col(), though I presume you'd want the fill aesthetic and scale for that one. Legend titles and other tweaks can be adjusted in the scale_colour_*() and scale_fill_*() functions.
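As a rough sketch of that fill-based variant, the example below maps a label column to fill for the bars and a fixed text label to colour for the line, so all three legend entries come from the scales. It runs on synthetic, made-up data rather than the API download, and the names cases, new_cases, avg7 and status are hypothetical, chosen only for this illustration. The rolling mean uses base R's stats::filter() as a stand-in for zoo::rollapply(), so the block needs only ggplot2.

```r
library(ggplot2)

# Synthetic daily counts: made-up numbers for illustration, NOT real case data
set.seed(1)
n <- 60
cases <- data.frame(
  date = as.Date("2021-02-01") + 0:(n - 1),
  new_cases = round(3000 + 1000 * sin(seq_len(n) / 8) + rnorm(n, sd = 200))
)

# Right-aligned 7-day mean in base R; a stand-in for
# zoo::rollapply(x, 7, mean, align = "right", fill = NA)
cases$avg7 <- as.numeric(stats::filter(cases$new_cases, rep(1 / 7, 7), sides = 1))

# Label each bar; these labels become the fill legend entries
cases$status <- ifelse(
  cases$date > max(cases$date) - 5,
  "Most recent 5 days (incomplete)",
  "Number of cases"
)

p <- ggplot(cases, aes(x = date)) +
  # fill is mapped inside aes(), so ggplot builds a fill legend
  geom_col(aes(y = new_cases, fill = status), width = 1) +
  # a constant string inside aes() creates a one-entry colour legend
  geom_line(aes(y = avg7, colour = "Cases (7-day average)"), na.rm = TRUE) +
  scale_fill_manual(
    name = NULL, # drop the legend title
    values = c("Number of cases" = "#5694ca",
               "Most recent 5 days (incomplete)" = "darkgrey")
  ) +
  scale_colour_manual(
    name = NULL,
    values = c("Cases (7-day average)" = "darkblue")
  ) +
  theme(legend.position = "bottom")

print(p)
```

Naming the entries in values ties each colour to its label explicitly, so the result does not depend on the order in which ggplot sorts the levels.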

teunbrand
  • Hi, thanks for your help. It wasn't really what I needed, but it pointed me in the right direction - thanks. My question was slightly confused because, as a noob, I wasn't permitted to add images. I worked it out in the end - and I've added the new code to the question as I can't seem to find anywhere else to put it. Cheers again. – Ian Apr 11 '21 at 08:08