17

I'm learning to use ggplot2 and am looking for the smallest ggplot2 code that reproduces the base::plot result below. I've tried a few things and they all ended up being horrendously long, so I'm looking for the smallest expression and ideally would like to have the dates on the x-axis (which are not there in the plot below).

df = data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201),
                val  = c(10, 5, 8, 20, 4))
plot(cumsum(rowsum(df$val, df$date)), type = "l")
Prradep
eddi

3 Answers

35

Try this:

ggplot(df, aes(x=1:5, y=cumsum(val))) + geom_line() + geom_point()


Just remove geom_point() if you don't want it.

Edit: Since you want the dates as the x-axis labels, you can still plot with x = 1:5 and set the labels from the data.frame on the x scale. Because x is numeric here, scale_x_continuous with explicit breaks is used (a discrete scale may drop the labels for numeric x in current ggplot2). Taking df:

ggplot(data = df, aes(x = 1:5, y = cumsum(val))) + geom_line() + 
        geom_point() + theme(axis.text.x = element_text(angle=90, hjust = 1)) + 
        scale_x_continuous(breaks = 1:5, labels = df$date) + xlab("Date")


Since you say you'll have more than one val per date, you can aggregate them first, using plyr for example.

require(plyr)
dd <- ddply(df, .(date), summarise, val = sum(val))

Then you can proceed with the same command, replacing df with dd and x = 1:5 with x = seq_len(nrow(dd)).
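Putting the aggregation and the plot together, a minimal sketch might look like the following. It assumes base R's `aggregate` in place of plyr (a swapped-in technique, not what the answer above uses), and the extra duplicated date as well as the column names `idx` and `cum` are illustrative additions. The dates are placed on the axis with `scale_x_continuous` because `x` is numeric here.

```r
library(ggplot2)

# Sample data from the question, plus one duplicated date (an assumption
# added here to exercise the aggregation step).
df <- data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201, 20130201),
                 val  = c(10, 5, 8, 20, 4, 6))

# Sum val per date with base R; the result is sorted by date.
dd <- aggregate(val ~ date, data = df, FUN = sum)
dd$idx <- seq_len(nrow(dd))   # numeric x position, one per distinct date
dd$cum <- cumsum(dd$val)      # running total over the daily sums

p <- ggplot(dd, aes(x = idx, y = cum)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = dd$idx, labels = dd$date) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Date")
print(p)
```

Computing `cum` on the aggregated frame, rather than calling `cumsum` inside `aes`, keeps the running total tied to distinct dates instead of to individual observations.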

Arun
  • Thanks, that's pretty long compared to the `plot`. Can `ggplot` at least redeem itself by adding dates on the `x` axis, probably rotated by 90 degrees? :) – eddi Apr 08 '13 at 14:53
  • quote: "and ideally would like to have the dates on the x axis (which are not there in the plot below)." – eddi Apr 08 '13 at 20:31
  • I also tried editing your post to fix it for the case where there are more than 1 val's per date, but the edit hasn't gone through. – eddi Apr 08 '13 at 20:32
  • Thanks! Just add another entry to the `df` with the same date and some `val` and see the `plot` output. It'll be the cumulative daily sum of `val`. – eddi Apr 08 '13 at 20:42
  • I'll mark this answered, but please fix the `rowsum` bit for future reference. Can't say this is pretty, but thanks :) – eddi Apr 08 '13 at 20:46
  • I'm not sure what your confusion is about. It's two different values for the same date and I'm interested in the cumulative sum plot of values, but indexed by date, as opposed to observation (which is what you did). I updated my post with a slightly modified version of your answer, with that issue fixed. Ok, I'll bite, what is `ggplot2` *really* all about? – eddi Apr 08 '13 at 21:21
7

After a couple of years, I've settled on doing:

ggplot(df, aes(as.Date(as.character(date), '%Y%m%d'), cumsum(val))) + geom_line()
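If a date can occur more than once, the one-liner above accumulates per row rather than per day. A hedged sketch of one way to handle that, assuming base R's `aggregate` for the daily sums and using the illustrative names `day`, `daily` and `cum` (none of which appear in the original post):

```r
library(ggplot2)

df <- data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201),
                 val  = c(10, 5, 8, 20, 4))

# Convert the yyyymmdd numbers to Date so ggplot2 picks a date scale.
df$day <- as.Date(as.character(df$date), format = "%Y%m%d")

# Collapse repeated days first, so the line is indexed by date, not by row.
daily <- aggregate(val ~ day, data = df, FUN = sum)
daily$cum <- cumsum(daily$val)

p <- ggplot(daily, aes(day, cum)) +
  geom_line() +
  scale_x_date(date_labels = "%Y-%m-%d") +   # label format is a free choice
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

With real `Date` values on the x-axis the points are spaced by elapsed time, which differs from the evenly spaced index used in the earlier answer.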
eddi
3

Jan Boyer seems to have found a more concise solution to this problem in this question, which I have shortened a bit and combined with the answers of Prradep, so as to provide a (hopefully) up-to-date answer:

ggplot(data = df, aes(x = date)) +
  geom_col(aes(y = value)) +
  geom_line(aes(x = date, y = cumsum(value) / 5, group = 1), inherit.aes = FALSE) +
  ylab("Value") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Note that date is not in Date format but character, and that value is already grouped, as suggested by Prradep in his answer above.
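To run this against the question's df, the data first has to be brought into that shape. A rough sketch under stated assumptions: `bars` is a hypothetical name, base R's `aggregate` does the grouping, and the division by 5 is left out on the assumption that it was a scaling choice specific to the linked question.

```r
library(ggplot2)

df <- data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201),
                 val  = c(10, 5, 8, 20, 4))

# One row per date, with the column renamed to `value` to match the code above.
bars <- aggregate(val ~ date, data = df, FUN = sum)
names(bars) <- c("date", "value")

# Character dates give a discrete axis; yyyymmdd strings sort chronologically.
bars$date <- as.character(bars$date)

p <- ggplot(bars, aes(x = date)) +
  geom_col(aes(y = value)) +
  geom_line(aes(y = cumsum(value), group = 1)) +  # group = 1 joins the categories
  ylab("Value") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

The bars show the per-date totals and the line shows their running sum on the same y scale.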

Lukas