Borrowing the example from "Plotting cumulative counts in ggplot2":

library(ggplot2)
x <- data.frame(A=replicate(200,sample(c("a","b","c"),1)),X=rnorm(200))
ggplot(x,aes(x=X,color=A)) + stat_bin(aes(y=cumsum(..count..)),geom="step")

In the resulting plot, cumsum accumulates across groups and facets. Why does it do that? ..count.. is clearly computed within groups, so why is cumsum not applied within groups when it is applied to ..count..? Does ggplot internally concatenate all ..count.. values into one vector and then apply cumsum to it?
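
The behaviour asked about here can be reproduced outside ggplot2. The following is a minimal base R sketch, assuming the per-group counts end up stacked in a single column before the expression is evaluated; the `counts` data frame and its values are made up for illustration and are not ggplot2 internals.

```r
# Hypothetical per-group bin counts, stacked in one data frame the way
# they would be if all groups were combined before cumsum is applied.
counts <- data.frame(
  group = rep(c("a", "b"), each = 3),
  count = c(1, 2, 3, 4, 5, 6)
)

# cumsum over the whole column runs straight through the group boundary.
counts$across <- cumsum(counts$count)

# ave() applies cumsum separately within each group.
counts$within <- ave(counts$count, counts$group, FUN = cumsum)

print(counts)
```

If the assumption holds, the `across` column corresponds to what the plot shows, and the `within` column is what the question is after.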

How can I resolve this correctly without preprocessing the data, e.g. with plyr?

I don't mind if the geom is not step; it can be line or even bar, as long as the graph is a cumulative plot.

colinfang
  • Read `?stat_bin`. It returns a data.frame and you access one of the data.frame columns with `..count..`. – Roland Oct 15 '13 at 12:43
  • I've read ?stat_bin. As often happens in ggplot documentation, it says nothing about how to get the statistical function applied by groups. The line "See _layer_" is sprinkled liberally around ggplot2 help pages, and what you get at ?layer is laughably brief. – IRTFM Oct 15 '13 at 13:22
  • Basically, you can't. Keep in mind that one of the basic design principles of ggplot is that you manipulate your data into the right shape and _then_ call ggplot. – joran Oct 15 '13 at 14:04
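
The "shape the data first, then call ggplot" approach from the last comment could look roughly like the sketch below. It uses only base R for the preprocessing and assumes that an unbinned cumulative count (one step per observation) is acceptable; the column name `cum_count` and the fixed seed are choices made for this illustration.

```r
library(ggplot2)

set.seed(1)  # fixed seed only so the illustration is reproducible
x <- data.frame(A = sample(c("a", "b", "c"), 200, replace = TRUE),
                X = rnorm(200))

# Sort by X, then number the rows 1..n within each level of A,
# which gives a running count of observations per group.
x <- x[order(x$X), ]
x$cum_count <- ave(seq_along(x$X), x$A, FUN = seq_along)

# Each group now has its own cumulative curve.
p <- ggplot(x, aes(x = X, y = cum_count, color = A)) + geom_step()
print(p)
```

Because the running count is computed before plotting, each curve ends at the number of observations in its own group.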

1 Answer

Here's how I handle this with one line of code (ddply and mutate):

library(plyr)
library(ggplot2)

df <- data.frame(x=rnorm(1000),kind=sample(c("a","b","c"),1000,replace=TRUE),
         label=sample(1:5,1000,replace=TRUE),attribute=sample(1:2,1000,replace=TRUE))

# cumulative proportion of x within each kind/label/attribute group
dfx <- ddply(df,.(kind,label,attribute),mutate,cum=rank(x)/length(x))

ggplot(dfx,aes(x=x))+geom_line(aes(y=cum,color=kind))+facet_grid(label~attribute)
PeterK
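
The quantity rank(x)/length(x) used in this answer is the empirical cumulative distribution within each group. As a possible alternative that avoids the preprocessing step, ggplot2's stat_ecdf() computes that curve per group. The sketch below is not part of the answer above; it assumes a cumulative proportion (rather than a cumulative count) is what is wanted, and the data are simulated for illustration.

```r
library(ggplot2)

set.seed(1)  # fixed seed only so the illustration is reproducible
df <- data.frame(x = rnorm(1000),
                 kind = sample(c("a", "b", "c"), 1000, replace = TRUE))

# stat_ecdf() computes the empirical CDF separately for each group,
# so every colour gets its own curve running from 0 to 1.
p <- ggplot(df, aes(x = x, color = kind)) + stat_ecdf(geom = "step")
print(p)
```

Adding facet_grid() to this plot should give one set of curves per panel, since the statistic is computed per group within each panel.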