Is there a way in R / ggplot2 to reorder the legend so that it matches the vertical order of the lines?

In this example, the blue line for non-melanoma skin cancer would be at the top of the legend.

library(tidyverse)

# Cancer incidence data from the Scottish NHS open data portal
all_nhs_data <- read_csv("https://www.opendata.nhs.scot/dataset/c2c59eb1-3aff-48d2-9e9c-60ca8605431d/resource/3aef16b7-8af6-4ce0-a90b-8a29d6870014/download/opendata_inc9418_hb.csv")

# Keep a single health board (code S08000016) and the columns needed for plotting
borders_hb_cncr <- all_nhs_data %>% 
  filter(HB == "S08000016") %>% 
  select(CancerSite, Sex, Year, IncidencesAllAges, CrudeRate)

# Keep Sex == "All" rows with at least 50 incidences in 2018,
# or more than 50 incidences in any other year, then draw one line per site
individual_viz <- borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types") %>% 
  filter(case_when(
    IncidencesAllAges >= 50 & Year == 2018 ~ Sex == "All",
    TRUE ~ Sex == "All" & IncidencesAllAges > 50
  )) %>%  
  ggplot() +
  aes(x = Year, y = IncidencesAllAges, group = CancerSite, colour = CancerSite) +
  geom_line()


duckmayr
Kerr McIntosh
  • possible alternative plots: https://stackoverflow.com/questions/29357612/plot-labels-at-ends-of-lines ; https://stackoverflow.com/questions/17492230/how-to-place-grobs-with-annotation-custom-at-precise-areas-of-the-plot-region/17493256#17493256 – user20650 Aug 26 '20 at 01:12

2 Answers

The forcats package (part of the tidyverse suite) has a function called fct_reorder2 which is intended for cases like this.

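The default function in fct_reorder2 is last2(), which reorders a factor (CancerSite) by the last value of y (IncidencesAllAges) when sorted by x (Year). The order is descending by default, so the line that ends highest is listed first in the legend. See the final example in the fct_reorder2 documentation.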

library(tidyverse)

borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types",
         case_when(
           IncidencesAllAges >=50 & Year == 2018 ~ Sex == "All",
           TRUE ~ Sex == "All" & IncidencesAllAges >50
         )) %>%  
  ggplot() +
  aes(x = Year, 
      y = IncidencesAllAges, 
      group = CancerSite, 
      colour = fct_reorder2(CancerSite, 
                            Year, 
                            IncidencesAllAges)) +
  geom_line() +
  labs(colour = 'Cancer Site') 
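
To see what fct_reorder2 does to the factor levels without downloading the NHS data, here is a minimal sketch on a made-up data frame. The toy data frame, its column names (site, year, count) and its values are assumptions for illustration only, not part of the question's data.

```r
library(forcats)

# Hypothetical toy data: three series, each observed in two years.
toy <- data.frame(
  site  = rep(c("A", "B", "C"), each = 2),
  year  = rep(c(2017, 2018), times = 3),
  count = c(10, 30,   # A ends at 30
            50, 20,   # B ends at 20
            5,  40)   # C ends at 40
)

# Reorder site by the count at the largest year. fct_reorder2 sorts in
# descending order by default, so the series that ends highest comes first.
reordered <- fct_reorder2(toy$site, toy$year, toy$count)
final_levels <- levels(reordered)
print(final_levels)
```

The levels come out in the order the lines finish on the right-hand side of the plot, which is the order ggplot2 uses for the legend entries.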


nniloc

My first instinct is to make CancerSite a factor and list its levels in the order you want. There might be a way to order the levels by each site's IncidencesAllAges value in 2018, which would let you reuse the code across plot variations, but for this I just converted to a factor with explicit levels. This does change the colours from the original, but you can set them manually (for example with scale_colour_manual()).

borders_hb_cncr %>% 
  filter(CancerSite != "All cancer types") %>% 
  filter(case_when(
    IncidencesAllAges >= 50 & Year == 2018 ~ Sex == "All",
    TRUE ~ Sex == "All" & IncidencesAllAges > 50)) %>%  
  mutate(CancerSite = factor(CancerSite, 
                             levels = c("Non-melanoma skin cancer", "Basal cell carcinoma of the skin",
                                        "Breast", "Colon", "Colorectal cancer", 
                                        "Squamous cell carcinoma of the skin", 
                                        "Trachea, bronchus and lung"))) %>%
  ggplot() +
  aes(x = Year, y = IncidencesAllAges, colour = CancerSite) +
  geom_line()
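
The idea of deriving the level order from the final-year values, rather than typing it out, could look roughly like the sketch below. It is not the answer's own code: the toy data frame, its column names and the colour choices are all hypothetical, and a named colour vector is just one way to keep each series on a fixed colour after reordering.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the filtered cancer data.
toy <- data.frame(
  site  = rep(c("A", "B", "C"), each = 2),
  year  = rep(c(2017, 2018), times = 3),
  count = c(10, 30, 50, 20, 5, 40)
)

# Sites sorted by their count in the latest year, highest first.
level_order <- toy %>%
  filter(year == max(year)) %>%
  arrange(desc(count)) %>%
  pull(site)

# A named vector pins each site to a colour regardless of level order
# (the colours themselves are arbitrary choices for illustration).
site_colours <- c(A = "steelblue", B = "firebrick", C = "darkgreen")

p <- toy %>%
  mutate(site = factor(site, levels = level_order)) %>%
  ggplot(aes(x = year, y = count, colour = site)) +
  geom_line() +
  scale_colour_manual(values = site_colours)
```

Because the level order is computed from the data, the same code can be reused when the filter or the health board changes.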


greg dubrow