
I have this data frame:

Time = c("2016-03-01","2016-03-02","2016-03-03","2016-03-02","2016-03-03","2016-03-02")
match = c("a","b","c","a","b","c") 
names = c("julien","julien","julien", "mathieu","mathieu","simon") 
df = data.frame(Time, names, match) 
df = df[order(Time),]
df
        Time   names match
1 2016-03-01  julien     a
2 2016-03-02  julien     b
4 2016-03-02 mathieu     a
6 2016-03-02   simon     c
3 2016-03-03  julien     c
5 2016-03-03 mathieu     b

I want the cumulative number of matches played by each player over time as a new column. That is, at any given time, I want to know how many matches each player has played so far. Like this:

        Time   names match nb.of.match.played
1 2016-03-01  julien     a                  1
2 2016-03-02  julien     b                  2
4 2016-03-02 mathieu     a                  1
6 2016-03-02   simon     c                  1
3 2016-03-03  julien     c                  3
5 2016-03-03 mathieu     b                  2 

It seemed easy to do, but I tried a couple of things and failed every time. Thanks for your help!

  • Show the code snippets that came closest to working and we can take a look. – IceFire Mar 29 '16 at 19:52
  • Sorry, I don't have any tangible lead (code snippets) worth showing.... I tried this solution : http://stackoverflow.com/questions/22843286/r-cumulative-count-over-multiple-columns-by-factor but I did not think the objectives were similar. – Julien Céré Mar 29 '16 at 21:10

1 Answer


I solved my problem with a per-player cumsum, computed here with ave.

cumsum cannot count the values of a factor directly, so I added a column of 1s for cumsum to work on.

Time = c("2016-03-01","2016-03-02","2016-03-03","2016-03-02","2016-03-03","2016-03-02")
match = c("a","b","c","a","b","c")
names = c("julien","julien","julien", "mathieu","mathieu","simon")
df = data.frame(Time, names, match) 
df = df[order(Time),]
df$nb = 1
df
        Time   names match nb
1 2016-03-01  julien     a  1
2 2016-03-02  julien     b  1
4 2016-03-02 mathieu     a  1
6 2016-03-02   simon     c  1
3 2016-03-03  julien     c  1
5 2016-03-03 mathieu     b  1

# within() returns a modified copy, so assign the result back to df
df <- within(df, {
  nb.match <- ave(nb, names, FUN = cumsum)
})
df
        Time   names match nb nb.match
1 2016-03-01  julien     a  1        1
2 2016-03-02  julien     b  1        2
4 2016-03-02 mathieu     a  1        1
6 2016-03-02   simon     c  1        1
3 2016-03-03  julien     c  1        3
5 2016-03-03 mathieu     b  1        2
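
As a variation on the approach above, the helper column of 1s can probably be skipped. The following is a minimal base R sketch, assuming the rows are already sorted by Time: ave() with seq_along numbers each player's rows in their current order. The column name nb.of.match.played is taken from the desired output in the question.

```r
# Same sample data as in the question.
Time  <- c("2016-03-01", "2016-03-02", "2016-03-03",
           "2016-03-02", "2016-03-03", "2016-03-02")
match <- c("a", "b", "c", "a", "b", "c")
names <- c("julien", "julien", "julien", "mathieu", "mathieu", "simon")
df <- data.frame(Time, names, match)

# Sort by date first: the running count follows the row order.
df <- df[order(df$Time), ]

# Number the rows within each player (1, 2, 3, ...) in the current order.
df$nb.of.match.played <- ave(seq_along(df$names), df$names, FUN = seq_along)
df
```

Because the count depends on row order, the sort by Time has to happen before the call to ave().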