42

I recently upgraded from ggplot2 0.8.9 to 0.9.0, and now my plot legends show only the factor levels actually used in the plot (unused levels are omitted). Previously, the legend included all factor levels. I'm running Windows 7 and R 2.15.0 (2.14.2 before today).

Is anyone else finding this too? Is there a way I can get the unused factor levels to display in my plot legend?

library(ggplot2)

df <- data.frame(fruit = rep(c("apple", "orange"), times=11), 
                 year = 1990:2011, 
                 qty = rnorm(22, 100, 20))

# This plot only gives "apple" in the legend now.
# Before, I used to get both "apple" and "orange". 
qplot(year, qty, data = subset(df, fruit=="apple"), colour = fruit) 

The qplot() used to give me both "apple" and "orange" in the legend (even though there were only points for "apple"). Now I only get "apple" in the legend.

Reason this came up - I am making many plots of subsets of a data set and I want the legends standardized across plots (normally I'd appreciate the unused levels being automatically dropped and not having to type droplevels(), but this is the one case I want those unused levels). Apologies if this is a question local to my computer only.


Edit: Note that as of R 4.0.0, the above code no longer creates df$fruit as a factor (stringsAsFactors now defaults to FALSE), which changes the behavior of ggplot in the question and answers below. To reproduce, use:

df <- data.frame(
  fruit = factor(rep(c("apple", "orange"), times=11)), 
  year = 1990:2011, 
  qty = rnorm(22, 100, 20)
)
Axeman
N. Sarkar

2 Answers

54

Yes, you want to add drop = FALSE to your colour scale:

ggplot(subset(df,fruit == "apple"),aes(x = year,y = qty,colour = fruit)) + 
    geom_point() + 
    scale_colour_discrete(drop = FALSE)
joran
  • @ToddWest, this is because the example data generated by `data.frame` used to create `df$fruit` as a `factor`, but now (as of R 4.0.0) `stringsAsFactors` defaults to `TRUE` and so it is a character vector. It's not a `ggplot` related change. – Axeman Nov 18 '21 at 00:17
  • Thanks, @Axeman. I put in a clarifying edit on this answer. The docs for base 3.6.2 do have `stringsAsFactors` defaulting to TRUE but, at least in my R 4.1.1 installation, stringsAsFactors is reconfigured to default to FALSE. I think the latter's probably more what you meant (it's definitely not a change I made explicitly). – Todd West Nov 18 '21 at 01:03
  • @ToddWest, Yes, I meant `stringsAsFactors` has been changed to `FALSE`. I misspoke in my earlier comment. Apologies for the confusion. – Axeman Nov 18 '21 at 22:29
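
To check which case applies to your own data, a minimal sketch along these lines may help. It sets stringsAsFactors explicitly so the result does not depend on the R version; the names chr_df and fct_df are made up for illustration.

```r
# Same input, two explicit settings of stringsAsFactors
chr_df <- data.frame(fruit = c("apple", "orange"), stringsAsFactors = FALSE)
fct_df <- data.frame(fruit = c("apple", "orange"), stringsAsFactors = TRUE)

is.factor(chr_df$fruit)  # FALSE: a plain character vector with no levels
is.factor(fct_df$fruit)  # TRUE: a factor with levels "apple" and "orange"
```

If is.factor() returns FALSE for the column mapped to colour, there are no unused levels for drop = FALSE to retain.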
14

A second way is to explicitly define the required entries by using the limits argument:

ggplot(subset(df,fruit == "apple"),aes(x = year,y = qty,colour = fruit)) + 
    geom_point() + 
    scale_colour_discrete(limits = c("apple", "orange"))
Axeman
  • 6
    For some reasons, this worked for scale_color_manual() and drop=FALSE didn't – zer0hedge Sep 04 '17 at 10:51
  • 1
    To answer @zer0hedge's question of four years ago, drop = FALSE is specific to retention of factor levels. So if the fill aesthetic is not a factor (it's a vector repeating certain strings or specific integer values, say) then setting drop = FALSE has no effect. Instead, it's necessary to list out all of these "levels" as limits (as of ggplot 3.3.5). – Todd West Nov 18 '21 at 00:03
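
The distinction described in the comment above can be sketched as follows, reusing the question's data. This is only an illustration: the seed is arbitrary, and the object names apples_fct, apples_chr, p_factor and p_char are made up here.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the example is repeatable

# Question's data with fruit as an explicit factor (needed on R >= 4.0.0)
df <- data.frame(
  fruit = factor(rep(c("apple", "orange"), times = 11)),
  year  = 1990:2011,
  qty   = rnorm(22, 100, 20)
)

# subset() keeps the unused "orange" level on the factor column
apples_fct <- subset(df, fruit == "apple")
levels(apples_fct$fruit)

# Factor column: drop = FALSE retains the unused level in the legend
p_factor <- ggplot(apples_fct, aes(x = year, y = qty, colour = fruit)) +
  geom_point() +
  scale_colour_discrete(drop = FALSE)

# Character column: no levels exist, so the entries are listed as limits
apples_chr <- apples_fct
apples_chr$fruit <- as.character(apples_chr$fruit)

p_char <- ggplot(apples_chr, aes(x = year, y = qty, colour = fruit)) +
  geom_point() +
  scale_colour_discrete(limits = c("apple", "orange"))
```

Printing p_factor or p_char draws the plot. As far as I know, recent ggplot2 releases may leave the key for an unused entry blank unless the layer is created with show.legend = TRUE, so that is worth trying if "orange" appears in the legend without a symbol.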