I want to sum the numbers in column B, grouped by the values in column A. For example:

Column A : 2001 2002 2002 2002 2003 2003

Column B: 1 2 3 4 5 6

I want to add a column C that sums up B based on A. My desired result is:

Column A : 2001 2002 2002 2002 2003 2003

Column B: 1 2 3 4 5 6

Column C: 1 9 9 9 11 11 (for 2002, the sum is 2+3+4 = 9)

I have searched a lot but have no idea where to begin. Thanks in advance for any help!

Amy

1 Answer

We can use mutate from dplyr after grouping by 'A':

library(dplyr)
df1 %>%
    group_by(A) %>%
    mutate(C = sum(B))
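
As a minimal sketch, assuming `df1` is a data frame holding the sample values from the question (the construction below is an assumption, since the post does not show how the data is stored), the grouped sum can be checked like this:

```r
library(dplyr)

# Sample data from the question; the name df1 matches the answer's code
df1 <- data.frame(
  A = c(2001, 2002, 2002, 2002, 2003, 2003),
  B = c(1, 2, 3, 4, 5, 6)
)

res <- df1 %>%
  group_by(A) %>%
  mutate(C = sum(B)) %>%  # sum of B within each value of A, repeated on every row
  ungroup()               # drop the grouping so later steps are not affected

res$C
```

Here `res$C` is 1, 9, 9, 9, 11, 11, which matches the desired column C. The `ungroup()` call is optional and only added so the result is no longer a grouped data frame.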

Or with ave from base R:

df1$C <- with(df1, ave(B, A, FUN = sum))
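
The reason `ave` fits here is that it returns a vector as long as its input, with each group's result repeated for every row of that group. A small sketch contrasting it with `tapply`, again assuming `df1` holds the question's sample values (the data frame construction is an assumption):

```r
# Sample data from the question (construction assumed for illustration)
df1 <- data.frame(
  A = c(2001, 2002, 2002, 2002, 2003, 2003),
  B = c(1, 2, 3, 4, 5, 6)
)

# tapply() collapses to one sum per distinct value of A (three values here)
group_sums <- tapply(df1$B, df1$A, sum)

# ave() computes the same per-group sums but keeps one value per row,
# so the result can be assigned directly as a new column
df1$C <- ave(df1$B, df1$A, FUN = sum)
```

`group_sums` has three elements (1, 9, 11), while `df1$C` has six (1, 9, 9, 9, 11, 11).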

An efficient option is data.table, which adds the column by reference:

library(data.table)
setDT(df1)[, C := sum(B), by = A]
akrun
  • May I ask one more question? What if there are two criteria, so that column C is summed by A and also by D? I tried `library(dplyr) df1 %>% group_by(A) , group_by(D)%>% mutate(C= sum(B))` and put the group_by(D) in many places, but it did not work. – Amy Dec 09 '16 at 04:23
  • @Amy By two criteria, do you mean two columns? If that is the case, `df1 %>% group_by(A, someothercolumn) %>% mutate(C = sum(B))` – akrun Dec 09 '16 at 04:25
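
To illustrate the two-column case from the comments, here is a minimal sketch. The column `D` and its values are hypothetical, made up only to show how passing both columns to a single `group_by()` call behaves:

```r
library(dplyr)

# A and B come from the question; D is a hypothetical second grouping column
df1 <- data.frame(
  A = c(2001, 2002, 2002, 2002, 2003, 2003),
  B = c(1, 2, 3, 4, 5, 6),
  D = c("x", "x", "x", "y", "y", "y")
)

res <- df1 %>%
  group_by(A, D) %>%      # one group per distinct (A, D) combination
  mutate(C = sum(B)) %>%  # sum of B within each combination
  ungroup()

res$C
```

With this made-up `D`, the year 2002 is split into two groups, so `res$C` is 1, 5, 5, 4, 11, 11. Chaining a second `group_by(D)` after `group_by(A)` would not give this result, because by default a new `group_by()` call replaces the existing grouping rather than adding to it.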