I'd like to use ggplot to draw a series of boxplots computed from all of the data in a dataset, but overlay jittered points for only a random sample of that data (e.g., 100 data points) to avoid over-plotting, since there are thousands of data points. Can anyone please help me with the code for this? The basic framework I have now is below, but I don't know which arguments, if any, can be added so that only a random sample of the data is displayed as the jittered points. Thanks for any help.
ggplot(datafile, aes(x=factor(var1), y=var2, fill=var3)) +
  geom_jitter(size=0.1, position=position_jitter(width=0.3, height=0.2)) +
  geom_boxplot(alpha=0.5) +
  facet_grid(.~var3) +
  theme_bw() +
  scale_fill_manual(values=c("red", "green", "blue"))
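
One possible direction, sketched here under assumptions rather than offered as a definitive answer: every ggplot2 layer accepts its own `data` argument, so the jitter layer can be given a randomly sampled subset while the boxplot layer keeps using the full data frame. The simulated `datafile` and the names `n_points` and `jitter_sample` are hypothetical stand-ins added for illustration, and the sketch samples 100 rows overall, not 100 per box.

```r
library(ggplot2)

# Hypothetical stand-in for the real `datafile`; replace with your own data.
set.seed(42)
datafile <- data.frame(
  var1 = sample(1:3, 3000, replace = TRUE),
  var2 = rnorm(3000),
  var3 = sample(c("a", "b", "c"), 3000, replace = TRUE)
)

# Pick at most 100 random rows to show as jittered points (assumed sample size).
n_points <- min(100, nrow(datafile))
jitter_sample <- datafile[sample(nrow(datafile), n_points), ]

p <- ggplot(datafile, aes(x = factor(var1), y = var2, fill = var3)) +
  # The jitter layer overrides the plot data with the sampled subset.
  geom_jitter(data = jitter_sample, size = 0.1,
              position = position_jitter(width = 0.3, height = 0.2)) +
  # The boxplot layer inherits the full `datafile` from ggplot().
  geom_boxplot(alpha = 0.5) +
  facet_grid(. ~ var3) +
  theme_bw() +
  scale_fill_manual(values = c("red", "green", "blue"))

print(p)
```

Because the subset is drawn once before plotting, the boxplot statistics are unaffected by it. If a fixed number of points per box or per facet is wanted instead, the sampling step would need to be done within each group before passing the result to `geom_jitter`.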