
I would like to plot data with color and shape aesthetics. This works fine; however, I am struggling to produce a nice legend.

Here is an MWE using the mtcars dataset. The data manipulation is necessary to reproduce the dataset I am working on. The variable cylindex is artificially created so that the color can follow the cylinder number while each observation within that category gets a different shape. The variable car is what I ideally want to use for the legend labels.

library(tidyverse)
library(plotly)

data(mtcars)
mtcars <- mtcars[c(1,3,4,5,8,30),c("cyl","disp","hp")] %>% 
  as_tibble(rownames="car") %>%
  group_by(cyl) %>% 
  mutate(cylindex=rank(cyl,ties.method="first")) %>%
  ungroup() %>%
  # removed because 'car' is available from the row name
  # but this was used in the original question and one answer
  #mutate(brand=letters[1:nrow(.)]) %>% 
  mutate_at(c("cyl","cylindex"),as_factor) %>%
  mutate(legend=paste(cyl,cylindex,sep=" - "))

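g <- mtcars %>% 
  ggplot(aes(x=disp, y=hp, color=cyl, shape=cylindex)) +
  geom_point(size=10)
print(g)     # static ggplot2 version: separate color and shape legends
ggplotly(g)  # interactive plotly version: one combined legend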

mtcars %>% 
  ggplot(aes(x=disp, y=hp, color=legend, shape=legend)) +
  geom_point(size=10) +
  scale_color_manual(values=c("red","red","blue","blue","blue","green")) + 
  scale_shape_manual(values=c(16,17,16,17,15,16))

ggplot2 produces the following: [image: ggplot2 output]

With ggplotly the legend is at least combined; however, I struggle to change the labels automatically to the respective car. [image: ggplotly output]

Using the answer from @dc37, I managed to produce a legend similar to the ggplotly one, but in a hardcoded way. Next to each entry I wrote how I would like the legend label to read. The information comes from the car column that was added from the row names.

How can I

- Combine both legends with ggplot2 (as ggplotly does) without hardcoding the number of data points

- Change the legend labels automatically to the value in another variable (car in this example)

Your help is much appreciated.

Archer
  • Have you seen this? https://stackoverflow.com/questions/12410908/combine-legends-for-color-and-shape-into-a-single-legend – Roman Luštrik Mar 19 '20 at 18:22
  • Not this one specifically to be honest, thanks for the pointer. It goes in the same direction as the answer from @dc37. – Archer Mar 20 '20 at 07:18

1 Answer


For combining legends, you can create a new categorical column combining both variables:

mtcars %>% 
  filter(as.numeric(cylindex)<=2) %>%
  mutate(Legend = paste(cyl,cylindex, sep = ",")) %>% 
  ggplot(aes(x=disp, y=hp, color=Legend, shape=Legend)) +
  geom_point(size=10) +
  scale_color_manual(values = rep(c("red","blue","green"), each = 2))+
  scale_shape_manual(values = rep(c(16,18),3))


If you want to use the brand variable for the labels, simply map brand in your aes:

mtcars %>% 
  filter(as.numeric(cylindex)<=2) %>%
  ggplot(aes(x=disp, y=hp, color=brand, shape=brand)) +
  geom_point(size=10) +
  scale_color_manual(values = rep(c("red","blue","green"), each = 2))+
  scale_shape_manual(values = rep(c(16,18),3))


Or, using scale_color_manual and scale_shape_manual, you can associate each level of the new "Legend" variable with a "brand" value:

mtcars %>% 
  filter(as.numeric(cylindex)<=2) %>%
  mutate(Legend = paste(cyl,cylindex, sep = ",")) %>% 
  ggplot(aes(x=disp, y=hp, color=Legend, shape=Legend)) +
  geom_point(size=10) +
  scale_color_manual(values = rep(c("red","blue","green"), each = 2), 
                     labels = c(`6,1` = "a",`6,2` = "b", `4,1` = "c",`4,2` = "h",`8,1` = "e", `8,2` = "g"))+
  scale_shape_manual(values = rep(c(16,18),3),
                     labels = c(`6,1` = "a",`6,2` = "b", `4,1` = "c",`4,2` = "h",`8,1` = "e", `8,2` = "g"))

enter image description here
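
The hardcoded values and labels above could also be derived from the data. The following is only a minimal sketch under stated assumptions, not a definitive solution: the data frame name `cars`, the palette `cyl_colors`, and the shape list `index_shapes` are introduced here for illustration and are not part of the question. It relies on ggplot2 merging the color and shape legends when both aesthetics are mapped to the same variable with the same title and labels.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Rebuild the question's six-car subset with a within-cylinder index.
cars <- mtcars[c(1, 3, 4, 5, 8, 30), c("cyl", "disp", "hp")] %>%
  as_tibble(rownames = "car") %>%
  group_by(cyl) %>%
  mutate(cylindex = row_number()) %>%
  ungroup() %>%
  arrange(cyl, cylindex) %>%
  # Fix the factor order so the legend is sorted by cyl, then by index.
  mutate(car = factor(car, levels = car))

# Assumed palette (one color per cylinder count) and assumed shape list
# (one shape per position within a cylinder group).
cyl_colors   <- c("4" = "red", "6" = "blue", "8" = "green")
index_shapes <- c(16, 17, 15, 18, 8)

# Fail early if the data needs more colors or shapes than were provided.
stopifnot(all(as.character(cars$cyl) %in% names(cyl_colors)),
          max(cars$cylindex) <= length(index_shapes))

# Named vectors keyed by car: the color comes from cyl, the shape from cylindex.
car_names    <- as.character(cars$car)
color_values <- setNames(unname(cyl_colors[as.character(cars$cyl)]), car_names)
shape_values <- setNames(index_shapes[cars$cylindex], car_names)

# Both aesthetics map to car, so the two legends collapse into one
# whose labels are the car names.
p <- ggplot(cars, aes(x = disp, y = hp, color = car, shape = car)) +
  geom_point(size = 10) +
  scale_color_manual(name = "Car", values = color_values) +
  scale_shape_manual(name = "Car", values = shape_values)
print(p)
```

Because the vectors are built from the rows themselves, adding or removing cars only requires that `cyl_colors` has an entry for every cylinder count and that `index_shapes` is at least as long as the largest cylinder group.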

Does this answer your question?

dc37
  • Thanks for your answer. Solutions 1 and 3 are correct, while using the brand directly mixes up the color scale (c and e are both blue, although they come from different cyl categories). I made the example too simple by having exactly 2 entries per cylinder category, but I am actually looking for a flexible method because the numbers can change. I will change the example accordingly. – Archer Mar 20 '20 at 07:31