I am interested in displaying only the top 3 most abundant groups in my ggplot2 legend.

For example, the table below has 7 groups, and I would like only groups D, E, and F to appear in my ggplot2 legend.

group  sample size
A      2
B      3
C      1
D      25
E      23
F      20
G      3

I tried searching online, but the closest answers I found were about reordering the legend.

Thanks in advance!

Cheers, Mel

MelissaSoh
  • What kind of plot are you trying to make? With the data you provided, it is not clear why a legend would be needed. – Maël Dec 02 '21 at 10:51
  • Please provide a [reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) and show the code you tried. – Mata Dec 02 '21 at 10:54

1 Answer

You could achieve this by setting the categories to show up in the legend via the breaks argument of scale_fill_discrete:

df <- data.frame(
  group = c("A", "B", "C", "D", "E", "F", "G"),
  sample.size = c(2L, 3L, 1L, 25L, 23L, 20L, 3L)
)

library(ggplot2)
library(dplyr)

# Names of the three groups with the largest sample size
top_group <- df %>% top_n(3, sample.size) %>% pull(group)

# All groups are still drawn; only those listed in `breaks` get a legend entry
ggplot(df, aes(group, sample.size, fill = group)) +
  geom_col() +
  scale_fill_discrete(breaks = top_group)
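As a side note, `top_n()` is marked as superseded in recent dplyr releases in favour of `slice_max()`. A minimal sketch of the same selection with `slice_max()`, assuming dplyr 1.0.0 or later and the example `df` from above (with the default `with_ties = TRUE`, more than three groups could be returned if sample sizes were tied):

```r
library(dplyr)

df <- data.frame(
  group = c("A", "B", "C", "D", "E", "F", "G"),
  sample.size = c(2L, 3L, 1L, 25L, 23L, 20L, 3L)
)

# Keep the 3 rows with the largest sample.size, then extract the group names
top_group <- df %>%
  slice_max(sample.size, n = 3) %>%
  pull(group)
```

The resulting `top_group` can be passed to `breaks` exactly as before.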

EDIT: With scale_fill_manual, one option is to name your vector of colors. This has the additional benefit that colors are assigned to categories by name, so you do not have to worry about the order in which you pass them to the values argument of the scale:

# Example color palette
colourslist <- scales::hue_pal()(length(unique(df$group)))
# Name your list of colors
names(colourslist) <- unique(df$group)

ggplot(df, aes("1", sample.size, fill = group)) +
  geom_col(width = 1, color = "darkgrey") +
  scale_fill_manual(values = colourslist, breaks = top_group) +
  coord_polar("y", start = 0)
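To illustrate the point about order, here is a rough sketch with a hand-picked named palette. The specific colours are hypothetical choices for this example only, and `top_group` is computed with base R so the block runs without dplyr; it assumes `group` is a character column (the default in R 4.0 and later).

```r
library(ggplot2)

df <- data.frame(
  group = c("A", "B", "C", "D", "E", "F", "G"),
  sample.size = c(2L, 3L, 1L, 25L, 23L, 20L, 3L)
)

# Base R: names of the three groups with the largest sample size
top_group <- head(df$group[order(df$sample.size, decreasing = TRUE)], 3)

# Hypothetical palette, deliberately NOT in the order of the groups:
# vivid colours for the top groups, shades of grey for the rest
colourslist <- c(
  D = "#1b9e77", E = "#d95f02", F = "#7570b3",
  A = "grey85", B = "grey75", C = "grey65", G = "grey55"
)

# Colours are matched to groups by name; only top_group appears in the legend
p <- ggplot(df, aes("1", sample.size, fill = group)) +
  geom_col(width = 1, color = "darkgrey") +
  scale_fill_manual(values = colourslist, breaks = top_group) +
  coord_polar("y", start = 0)

print(p)
```

Because every group has an entry in the named vector, each slice keeps its own colour even though only three groups are listed in the legend.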

stefan
  • This probably answers part of the question (upvote), but it manually does the "displaying only the top 3 most abundant groups". You might want to annotate each row with a new column to highlight those 3 most abundant groups, then use the `scale_colour_manual()` – Paul Endymion Dec 02 '21 at 11:00
  • I'm sure you know that instead of painstakingly spelling out `c("A", "B", "C", "D", "E", "F", "G")`, you could also write `LETTERS[1:7]` ... ;) – tjebo Dec 02 '21 at 13:19
  • @tjebo Thank god, `datapasta` did the job for me. :D – stefan Dec 02 '21 at 13:21
  • Cannot believe I didn't know this one till now :D – tjebo Dec 02 '21 at 15:15
  • @tjebo Haha, yeah, I don't use it that often, but `datapasta` is good to know if you have to copy & paste data (quickly) from a PDF or an XL or from an HTML table. – stefan Dec 02 '21 at 15:31
  • @stefan, thank you so much! I am currently using `scale_fill_manual()`, where I specify the colours I am interested in (saved in `colourslist`). The `breaks` option does not work as I would like: when I use it, only the items in my list have colour and the rest are grey. `top.list <- c("Ambassis kopsii", "Gerres oyena", "Monacanthus chinensis")` `ggplot(df, aes(x = "", y = n, fill = Host)) + geom_bar(width = 1, stat = "identity", color = "darkgrey") + coord_polar("y", start = 0) + scale_fill_manual(values = colourslist, breaks = top.list)` Thanks again! – MelissaSoh Dec 03 '21 at 06:19
  • Hi Melissa. Without a look at your `colourslist` I can only guess. My guess is that you simply pass an unnamed vector of colors, which gives me grey bars too. One option to fix this would be to name the `colourslist`. See my edit. – stefan Dec 03 '21 at 07:18
  • Hello @stefan, thank you so much! I managed to plot what I was looking for. – MelissaSoh Dec 08 '21 at 03:20