R create a column based on duplicate values of one column, and a second column

Question

I have two columns. Column A contains many duplicate values (e.g. 10, 10, 20, 5, 10, 20, ...). Column B is a binary (0/1) variable. I need R to sort column A first, if necessary, and then, for each set of duplicate values in column A, sum the corresponding values in column B. For example, if 10 appears five times in column A, I need the sum of the column B values on those five rows.

How do I do this?

Thanks.

Please take the time to create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Include sample input and desired output. Show any code that you may have tried so far and describe exactly where you are getting stuck. — MrFlick, Jun 19 '15 at 15:24

score 3 · Accepted Answer · answered Jun 19 '15 at 15:30

You want an aggregation:

aggregate(B~A, df, FUN=sum)
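As a minimal sketch of how this call behaves, the example below builds a small data frame from the A values mentioned in the question. The B values are made up for illustration, since the question gives none. The input is deliberately left unsorted: `aggregate()` groups the rows itself and returns the groups in increasing order of A, so no separate sorting step should be needed.

```r
# Hypothetical sample data: A values from the question, B values invented
df <- data.frame(A = c(10, 10, 20, 5, 10, 20),
                 B = c(1, 0, 1, 1, 1, 0))

# Sum B within each distinct value of A
totals <- aggregate(B ~ A, data = df, FUN = sum)
print(totals)
#    A B
# 1  5 1
# 2 10 2
# 3 20 1
```

The result is a data frame with one row per distinct value of A.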

Neal Fultz

score 0 · Answer 2 · answered Jun 19 '15 at 15:43

df = data.frame(A = c(5,10, 5, 10), B=c(0,1,1,1))
tapply(df$B, df$A, sum)
#  5 10 
#  1  2

Neal's solution presents the same result in a nicer way, as a data frame:

aggregate(B~A, df, FUN=sum)
#    A B
# 1  5 1
# 2 10 2
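Both calls above return one value per group. If the aim is instead what the question title suggests, a new column in the original data holding each row's group total, one possible sketch uses base R's `ave()`. This is an assumption about the desired output, which the question does not specify, and the column name `B_sum` is hypothetical.

```r
# Same sample data as above
df <- data.frame(A = c(5, 10, 5, 10), B = c(0, 1, 1, 1))

# ave() returns a vector as long as its input, so every row
# receives the sum of B for its own value of A
df$B_sum <- ave(df$B, df$A, FUN = sum)
print(df)
#    A B B_sum
# 1  5 0     1
# 2 10 1     2
# 3  5 1     1
# 4 10 1     2
```

Note that `FUN` has to be passed by name here, because unnamed arguments after the first are treated as grouping variables.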

mts
