I'm trying to use the sapply (or similar) function to sum all of the values that match multiple criteria throughout the data set.

I was able to write the code for one specific match, but I'm not sure how to apply it to every unique combination in the data frame.

For example, suppose my data frame is constructed with 3 columns:

col1 <- c("a", "a", "a", "b", "b", "b", "b", "b", "b")
col2 <- c(1, 1, 1, 2, 2, 2, 1, 1, 1)
col3 <- c(10, 5, 10, 5, 5, 1, 3, 4, 5)
df <- data.frame(col1, col2, col3)

Here is the code I'm using for one match:

tmp <- subset(df, col1 == "a" & col2==1)
sum(tmp[,3])

This code correctly returns 25 for the sum of col3 matching the 2 criteria in the subset function.

How do I do this calculation for all 3 unique combinations of col1 and col2 in the data frame? I'm looking for the following output:

col1  col2 sum_col3
a     1    25
b     1    12
b     2    11

Thanks in advance for your assistance.

Sotos
jacoby
    Standard approach in base R would be `?aggregate`. Also take a look here: http://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group – talat Sep 09 '16 at 06:47

1 Answer

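Here's what you can try. The formula col3 ~ col1 + col2 tells aggregate to apply sum to col3 within each combination of col1 and col2 that occurs in df: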

> result <- aggregate(col3 ~ col1 + col2 , df, sum)
> result
  col1 col2 col3
1    a    1   25
2    b    1   12
3    b    2   11
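
If the summed column should be named sum_col3 as in the question, or if a sapply-based version is preferred, something along these lines may work. This is only a sketch built on the example data from the question; the names result and sums are just illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  col1 = c("a", "a", "a", "b", "b", "b", "b", "b", "b"),
  col2 = c(1, 1, 1, 2, 2, 2, 1, 1, 1),
  col3 = c(10, 5, 10, 5, 5, 1, 3, 4, 5)
)

# Sum col3 within each observed (col1, col2) combination
result <- aggregate(col3 ~ col1 + col2, data = df, FUN = sum)

# Rename the summed column to match the output requested in the question
names(result)[names(result) == "col3"] <- "sum_col3"

# Sort by col1, then col2, and reset the row names
result <- result[order(result$col1, result$col2), ]
rownames(result) <- NULL
print(result)

# sapply-based alternative: split col3 by both grouping columns,
# drop combinations that never occur, then sum each piece.
# The result is a named vector with names such as "a.1" and "b.2".
sums <- sapply(split(df$col3, list(df$col1, df$col2), drop = TRUE), sum)
print(sums)
```

The aggregate version returns a data frame, which is usually easier to work with afterwards; the sapply version returns a named numeric vector, so the group labels would have to be parsed back out of the names if they are needed as columns.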
Pankaj Kaundal