How to sum a variable by group without aggregating the data frame in R?

Asked Nov 14 '18 at 14:21

Active Nov 14 '18 at 14:32

Viewed 1,250 times

Although I have found many ways to calculate the sum of a variable by group, every approach ends up creating a new data set that collapses the duplicate cases.

To be more precise, if I have a data frame:

and I want to count how many times each ID occurs across the different years, there are many ways (using aggregate, tapply, dplyr, sqldf, etc.) that rely on "group by" functionality and in the end give something like:

I haven't managed to find a way to calculate the same thing but keep my original data frame, in order to obtain:

 id  year   count  
 1   2010     3
 1   2015     3
 1   2017     3
 2   2011     2
 2   2017     2
 3   2015     1

and therefore not aggregate my duplicate cases. Has anybody already figured out how to do this? Thank you in advance.
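For reference, here is a minimal sketch of the collapsing behaviour described above. The input data frame is not shown in this post, so the `df` below is an assumption reconstructed from the id and year columns of the desired output; the names `df` and `agg` are hypothetical.

```r
# Assumed example data, rebuilt from the desired output in the question.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  year = c(2010, 2015, 2017, 2011, 2017, 2015)
)

# A "group by" style count: aggregate() returns one row per id,
# so the original six rows are collapsed to three.
agg <- aggregate(year ~ id, data = df, FUN = length)
names(agg)[2] <- "count"  # the counted column keeps the name "year" by default

print(agg)
```

This is the aggregated shape the question wants to avoid: the per-id count is correct, but the individual id/year rows are gone.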


am.nik

`dplyr --> mutate`, `data.table --> :=`, `baseR --> ave` – Sotos Nov 14 '18 at 14:23
In `dplyr`, you can use `df %>% add_count(id)` or `df %>% group_by(id) %>% mutate(count = n())`. – tmfmnk Nov 14 '18 at 14:25
First, use the `rle` function (or another) to calculate your second table. Second, merge the first and second tables by the 'id' column to get the third table. – Bastien Nov 14 '18 at 14:28
Did you get the solution? – sai saran Nov 14 '18 at 14:29
@sai saran Yes, thanks to tmfmnk I did the following using dplyr: `df %>% add_count(id) %>% group_by(id, year)`. Thank you all. – am.nik Nov 14 '18 at 16:29
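
The suggestions in the comments above can be sketched in base R as follows. This is only an illustration under assumptions: the data frame `df` is reconstructed from the desired output in the question, and the names `counts` and `merged` are hypothetical. The dplyr and data.table equivalents are left as comments because they need those packages installed.

```r
# Assumed example data, rebuilt from the desired output in the question.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  year = c(2010, 2015, 2017, 2011, 2017, 2015)
)

# Approach 1 (base R, ave): apply length() within each id group and
# return one value per original row, so no rows are collapsed.
df$count <- ave(df$id, df$id, FUN = length)
print(df)

# Approach 2 (count, then merge back): build the aggregated table first,
# then join it onto the original rows by id.
counts <- aggregate(year ~ id, data = df, FUN = length)
names(counts)[2] <- "count"
merged <- merge(df[, c("id", "year")], counts, by = "id")
print(merged)

# Package equivalents mentioned in the comments (not run here):
#   dplyr:      df %>% add_count(id)
#               df %>% group_by(id) %>% mutate(count = n())
#   data.table: setDT(df)[, count := .N, by = id]
```

Both base R approaches keep all six rows and attach the per-id count to each of them; `ave` does it in one step, while the merge route makes the intermediate aggregated table explicit.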
