3

I would like to add the values of one column grouping them by two columns. I found how to do this on one column, but could not figure out how to do this on two columns. For example if I have the following data frame:

x=c("a","a", "b", "b","c", "c","a","a","b","b","c","c", "a", "a","b","b", "c", "c") 
y=c(1:18) 
q=c("M","M","M", "M","M","M","W","W","W","W","W","W","F","F","F","F","F","F")
df<-data.frame(x,y,q)

I would like to add the values in y column across x and q, so that I have a new data frame like this one

x=c("a","a", "b", "b","c", "c","a","a","b","b","c","c", "a", "a","b","b", "c", "c") 
y=c(3,7,11,15,19,23,27,31,35) 
q=c("M","M","M","W","W","W","F","F","F")
d<-data.frame(x,y,q)
Vasile
  • 1,017
  • 2
  • 10
  • 19

1 Answers1

5

You have several options:

1: Base R

aggregate(y ~ x + q, df, sum)

2: data.table

library(data.table)
setDT(df)[, .(sumy = sum(y)), by = .(x, q)]

# when you want to summarise several columns:
setDT(df)[, lapply(.SD, sum), by = .(x, q)]

3: dplyr

library(dplyr)
df %>% group_by(x, q) %>% summarise(sumy = sum(y))

# when you want to summarise several columns:
df %>% group_by(x, q) %>% summarise_each(funs(sum))

All should give you the same result (although not in the same order). For example, the data.table output looks like this:

   x q  y
1: a M  3
2: b M  7
3: c M 11
4: a W 15
5: b W 19
6: c W 23
7: a F 27
8: b F 31
9: c F 35
Jaap
  • 81,064
  • 34
  • 182
  • 193