This is somewhat related to R+ggplot+geom_bar+scale_x_continuous+limits: leftmost and rightmost bars not showing on plot, but I decided to post separately because the question is different.
I need to create two different versions of a plot from a dataset: one containing all values for a specific column and another with filtered values. It is easier to see with my data frame:
consts = paste('"Category","Year","Name","Quantity"\n',
'SHEEP,2003,Alice,10\n',
'SHEEP,2005,Alice,3\n',
'SHEEP,2008,Alice,2\n',
'SHEEP,2009,Alice,1\n',
'SHEEP,2012,Alice,3\n',
'CACTUS,1997,Bob,45\n',
'CHICKEN,1997,Bob,6\n',
'SHEEP,1998,Bob,2\n',
'SHEEP,2005,Bob,5\n',sep = "")
data <- read.csv(text=consts,header = TRUE)
Suppose I need to see how many animals (and vegetables :-) were sheared by year. To reuse code I wrote a function that receives as parameters the data set and an optional name, and plots the data as a stacked bar with Year as the X axis, Quantity as the Y axis and using Category to create the different parts of the stack. The function is:
library(ggplot2)
library(plyr)  # provides ddply() and mutate()
# Plot either all data or select by name.
doPlot <- function(data,name=character(0)) {
# If we pass a name as parameter we need to change the output file name, the plot title
# and subset the data.
fname = sprintf("Performance.png")
title = "Performance"
if(length(name)!=0) {
fname = sprintf("Performance-%s.png",name)
title = paste(title," - ",name)
data <- subset(data,Name == name)
}
byYear <- aggregate(Quantity ~ Year+Category, data, sum)
byYear = ddply(byYear, "Year", mutate, label_y = cumsum(Quantity))
png(filename=fname,width = 960, height = 640)
g <- ggplot(byYear, aes(x=Year,y=Quantity))
g <- g + geom_bar(stat="identity",aes(fill=Category), colour="black") +
ggtitle(title) +
scale_fill_discrete("Category",labels=c("Sheep","Cactus","Chicken"),c=45, l=80)+
scale_x_continuous(name="Year", limits=c(1996,2013), breaks=seq(1996,2013,1)) +
mytheme+
geom_text(aes(label=Quantity,y=label_y), vjust=1.3,size=6)
print(g)
dev.off()
}
If you want to reproduce the plots you will also need some constants for the theme:
# Colors and themes for the plot
goodBlue <- "#7fbfff"
darkBlue <- "#3f5f7f"
mytheme <- theme(plot.title = element_text(color=darkBlue,face="bold",size=20),
axis.title.x = element_text(color=darkBlue,face="bold",size=16),
axis.title.y = element_text(color=darkBlue,face="bold",size=16),
axis.text.x = element_text(color=darkBlue,face="bold",size=14),
axis.text.y = element_text(color=darkBlue,face="bold",size=12),
legend.title = element_text(color=darkBlue,face="bold",size=18),
legend.text = element_text(color=darkBlue,face="bold",size=12))
Calling
doPlot(data)
Gives the following plot:
Not exactly what I want: note that the first category, which is actually CACTUS, was labeled "Sheep"!
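To illustrate where the mismatch seems to come from, here is a minimal sketch in base R. It assumes that, with no explicit level order, the categories are sorted alphabetically and that the labels vector is then matched to them by position. The variable names are only for illustration.

```r
# Rebuild the Category column from the data frame above.
category <- c("SHEEP", "SHEEP", "SHEEP", "SHEEP", "SHEEP",
              "CACTUS", "CHICKEN", "SHEEP", "SHEEP")

# With no explicit levels, factor() sorts the distinct values alphabetically.
default_levels <- levels(factor(category))
print(default_levels)

# The labels passed to scale_fill_discrete() are given in a different order,
# so pairing them by position attaches each label to the wrong category.
legend_labels <- c("Sheep", "Cactus", "Chicken")
pairing <- setNames(legend_labels, default_levels)
print(pairing)
```

Under that assumption, CACTUS ends up with the label "Sheep", which matches what the plot shows.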
I cannot get the desired results when I filter and plot the data with a call to:
doPlot(data,"Alice")
Here is the plot:
Legend and colors are correct here: all Alice ever sheared was sheep.
What I wanted was:
1. Ensure that every plot has the same legend, with the entries in the order I want them to appear (Sheep, Cactus, Chicken) and with the correct colors on both the legend and the bars;
2. Ensure that the legend appears with all entries even if they are not present in the data being plotted. E.g., the second plot would have the same legend as the first (readers would notice that Cactus/Chicken were part of the data but that Alice didn't shear any).
Thanks in advance.
EDIT: I can solve item 1 by enforcing an order on the factor levels:
data$Category <- factor(data$Category, levels = c("SHEEP", "CACTUS", "CHICKEN"))
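For item 2, one possible direction is sketched below. It is not a confirmed fix for doPlot: it skips the aggregate/ddply steps and the theme, and it assumes the fill scale's drop argument is what controls whether unused factor levels stay in the legend. The show.legend = TRUE argument is also an assumption, added because newer ggplot2 releases may otherwise leave the keys for unused levels blank.

```r
library(ggplot2)

# Same rows as the data frame above, built directly for a self-contained example.
data <- data.frame(
  Category = c("SHEEP", "SHEEP", "SHEEP", "SHEEP", "SHEEP",
               "CACTUS", "CHICKEN", "SHEEP", "SHEEP"),
  Year     = c(2003, 2005, 2008, 2009, 2012, 1997, 1997, 1998, 2005),
  Name     = c(rep("Alice", 5), rep("Bob", 4)),
  Quantity = c(10, 3, 2, 1, 3, 45, 6, 2, 5)
)

# Fix the level order once, before any filtering (as in the EDIT above).
data$Category <- factor(data$Category, levels = c("SHEEP", "CACTUS", "CHICKEN"))

# subset() filters rows but keeps all three factor levels.
alice <- subset(data, Name == "Alice")
print(levels(alice$Category))

# drop = FALSE asks the scale to keep levels that have no rows in the data.
g <- ggplot(alice, aes(x = Year, y = Quantity)) +
  geom_bar(stat = "identity", aes(fill = Category), colour = "black",
           show.legend = TRUE) +
  scale_fill_discrete("Category", labels = c("Sheep", "Cactus", "Chicken"),
                      drop = FALSE)
# Call print(g) to render the plot.
```

If an intermediate step in doPlot turns out to drop the unused levels, re-applying factor() with the same levels just before plotting might be needed as well.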