For the following dataset:

d = data.frame(date = as.Date(as.Date('2015-01-01'):as.Date('2015-04-10'), origin = "1970-01-01"),
               group = rep(c('A','B','C','D'), 25), value = sample(1:100))
head(d)
         date group value
1: 2015-01-01     A     4
2: 2015-01-02     B    32
3: 2015-01-03     C    46
4: 2015-01-04     D    40
5: 2015-01-05     A    93
6: 2015-01-06     B    10

... can anyone advise a more elegant way to calculate a cumulative total of values by group than this data.table method?

library(data.table)
setDT(d)
d.cast = dcast.data.table(d, group ~ date, value.var = 'value', fun.aggregate = sum)
c.sum = d.cast[, as.list(cumsum(unlist(.SD))), by = group]

... which is pretty clunky and yields a wide table that needs tidyr::gather or reshape2::melt to get back into long format.

Surely R can do better than this??

geotheory

I'm confused. What you describe in words is `setDT(d)[, cumsum(value), by = group]` – Frank May 22 '15 at 15:04

You should really use `set.seed()` to make the example [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and you should also include the desired output for your sample input. – MrFlick May 22 '15 at 15:04

4 Answers

If you just want cumulative sums per group, then you can do

transform(d, new=ave(value,group,FUN=cumsum))

with base R.
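
As a quick sanity check of what `ave()` returns, here is a minimal sketch on a small hand-made data frame. The name `small`, the column `running`, and the values are made up for illustration and are not taken from the question. `ave()` applies `cumsum` within each group and returns the results in the original row order, so the new column lines up with the input rows.

```r
# Hypothetical toy data: two interleaved groups
small <- data.frame(group = c("A", "B", "A", "B", "A"),
                    value = c(1, 10, 2, 20, 3))

# Running total within each group, returned in the original row order
small$running <- ave(small$value, small$group, FUN = cumsum)
small$running
# [1]  1 10  3 30  6
```

Note that the running total follows the order in which rows appear, so sort by date first if the data may be unordered.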

MrFlick

This should work

library(dplyr)
d %>% 
  group_by(group) %>% 
  arrange(date) %>% 
  mutate(Total = cumsum(value))

Akhil Nair

Since this question is tagged with data.table, you are probably looking for the following (a modification of @Frank's comment):

setDT(d)[order(date), new := cumsum(value), by = group]

This computes the cumulative sum in date order (if the data are already sorted by date, you can drop order(date)) and updates your data set in place using the := operator. The rows themselves are not physically reordered; use setorder(d, date) if you also want that.
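
To see how `order()` in `i` interacts with `:=`, here is a small sketch on deliberately unsorted data. The names `dt` and `running` and the values are hypothetical, chosen only for illustration. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data, deliberately not sorted by date
dt <- data.table(date  = as.Date(c("2015-01-03", "2015-01-01", "2015-01-02")),
                 group = "A",
                 value = c(3, 1, 2))

# cumsum is evaluated over the rows taken in date order,
# and each result is written back to its original row by reference
dt[order(date), running := cumsum(value), by = group]

dt$running   # 6 1 3: row 1 holds the latest date, so it carries the full total
dt$date[1]   # still 2015-01-03: the rows were not moved
```

If a date-sorted table is wanted as well, calling `setorder(dt, date)` beforehand is one option.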

David Arenburg

Is this it?

sp <- split(d, d$group)
res <- lapply(seq_along(sp), function(i) cumsum(sp[[i]]$value))
res <- lapply(seq_along(res), function(i){
        sp[[i]]$c.sum <- res[[i]]
        sp[[i]]
    }) 
res <- do.call(rbind, res)
res

Rui Barradas