I have several dataframes that I use to create plots. In these plots, I would like the fill colour to be based on the factor levels of a certain variable. In some of the dataframes, however, not all levels occur (e.g. out of 3 potential factor levels, only 2 are present), so the same level ends up with different colours in different graphs. I would like each level to have the same colour everywhere. So far I have stored the desired colours in a column of the dataframe and tried to pass that column to ggplot, but my solution does not work reliably. Let me show you what I mean:
library(plyr); library(dplyr); library(ggplot2)
dat <- mtcars
dat$col <- mapvalues(dat$cyl, from = c(4, 6, 8), to = c("yellow", "red", "grey"))
q <- ggplot(dat, aes(x = gear, y = carb)) +
    geom_bar(stat = "identity", position = "dodge", aes(fill = factor(cyl))) +
    scale_fill_manual(values = unique(dat$col))
dat2 <- filter(dat, cyl > 4)
p <- ggplot(dat2, aes(x = gear, y = carb)) +
    geom_bar(stat = "identity", position = "dodge", aes(fill = factor(cyl))) +
    scale_fill_manual(values = unique(dat2$col))
Comparing the two graphs shows that the same cyl level does not get the same colour in both. The problem is that unique(dat$col) returns the colours in order of first appearance in the data, which does not necessarily match the order of the factor levels, so I am looking for a more robust solution. This seems like it should be easy, and I suspect I am overlooking a really simple fix, which is why I am asking here. Does anyone have a good idea? Any hint would be appreciated!
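
To make the mismatch concrete, here is a minimal sketch. It prints what unique() returns for the full and the filtered data, then tries a named colour vector in place of the unnamed one. The name cyl_colours is made up for illustration, and I am only assuming that a named vector is a sensible direction here; it is not something I have confirmed as the recommended approach.

```r
library(ggplot2)

dat <- mtcars

# Hypothetical lookup table: names are the cyl levels, values are the colours.
cyl_colours <- c("4" = "yellow", "6" = "red", "8" = "grey")

# Same colour column as above, built from the lookup table.
dat$col <- unname(cyl_colours[as.character(dat$cyl)])

# unique() follows the row order of the data, not the order of the levels.
unique(dat$col)                 # "red" "yellow" "grey"
unique(dat$col[dat$cyl > 4])    # "red" "grey"

# An unnamed vector is assigned to the levels by position, so with the
# output above level 4 gets "red" in the full data, while level 6 gets
# "red" in the filtered data.

# A named vector is matched to the levels by name instead, so a missing
# level should not shift the colours of the remaining ones.
dat2 <- subset(dat, cyl > 4)
p <- ggplot(dat2, aes(x = gear, y = carb, fill = factor(cyl))) +
    geom_bar(stat = "identity", position = "dodge") +
    scale_fill_manual(values = cyl_colours)
```

If this is right, the same scale_fill_manual(values = cyl_colours) line could be reused for every dataframe, and the extra col column would no longer be needed.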