
I have the following code in R

set.seed(12048)
name <- sample(letters[1:3], 10, replace=T)
df <- data.frame(name, stringsAsFactors = F)
df$score <- sample(0:1, nrow(df), replace=T)
df$rank <- as.numeric(ave(df$name, df$name, FUN=seq_along))
v <- by(df$score, df$name, cumsum)

This produces the following table

   name score rank
1     b     0    1
2     a     1    1
3     a     1    2
4     c     1    1
5     c     1    2
6     a     0    3
7     a     1    4
8     b     0    2
9     c     1    3
10    c     1    4

I am now trying to add a cumsum column that holds the running count of 1s in the 'score' column for each 'name', like so:

   name score rank  cumsum
1     b     0    1       0
2     a     1    1       1
3     a     1    2       2
4     c     1    1       1
5     c     1    2       2
6     a     0    3       2
7     a     1    4       3
8     b     0    2       0
9     c     1    3       3
10    c     1    4       4

I tried the following and the cumulative sums are correct, but I can't figure out how to merge them back into my data frame df in the original row order:

> dftable <- as.data.table(df)
> dfn <- dftable[,list(cumsum = cumsum(score)),by=list(name)]
> dfn
    name cumsum
 1:    b      0
 2:    b      0
 3:    a      1
 4:    a      2
 5:    a      2
 6:    a      3
 7:    c      1
 8:    c      2
 9:    c      3
10:    c      4

Any help is highly appreciated.

user3701522
    Use `cumsum := cumsum(score), by=name` and maybe check out the vignettes for the package. In base R, you can use `ave`. – Frank May 01 '17 at 03:42

1 Answer


We can use mutate from dplyr after grouping by name, which keeps the rows in their original order:

 library(dplyr)
 df %>% 
    group_by(name) %>%
    mutate(cumsum = cumsum(score))

If we want to use base R, one option is ave, as @Frank mentioned in the comments:

df$cumsum <- with(df, ave(score, name, FUN = cumsum))
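
As a quick check, here is a minimal self-contained sketch that rebuilds the question's table by hand (so it does not depend on the random seed) and applies the same ave call. The column name score_cumsum is an illustrative choice, not something from the original post.

```r
# Data copied from the question's table, so the example does not rely on set.seed().
df <- data.frame(
  name  = c("b", "a", "a", "c", "c", "a", "a", "b", "c", "c"),
  score = c(0, 1, 1, 1, 1, 0, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each group of `name` and returns the results
# in the original row order, so no merge step is needed.
df$score_cumsum <- with(df, ave(score, name, FUN = cumsum))

print(df)
# score_cumsum is 0 1 2 1 2 2 3 0 3 4
```

Because ave returns a vector the same length as its input, the result lines up with the existing rows directly.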

NOTE: It is better not to name objects after functions. For example, call the column Cumsum1 (or anything else) instead of cumsum to avoid problems in the future.
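
Since the question already uses data.table, the `:=` approach from the comment can be sketched as below. This is only an illustration: the data is typed in from the question's table, and score_cumsum is a hypothetical column name chosen so that it does not clash with the cumsum function.

```r
library(data.table)

# Data copied from the question's table.
dt <- data.table(
  name  = c("b", "a", "a", "c", "c", "a", "a", "b", "c", "c"),
  score = c(0, 1, 1, 1, 1, 0, 1, 0, 1, 1)
)

# `:=` adds the column by reference and leaves every row in its original
# position, unlike list(...) with by=, which returns rows grouped by name.
dt[, score_cumsum := cumsum(score), by = name]

print(dt)
```

This avoids the merge step entirely, because the grouped result is written straight into the existing table.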

akrun