1

I would like to know how to do a conditional sum in R. Let's say I want only the sum of Store ID = 111111 in the year 2012. How would I do this?

YEAR       STORE ID       AMOUNT
2011       111111         11
2011       222222         12
2012       111111         4 
2012       222222         4 
2012       111111         45
2012       333333         7
AJC
  • 133
  • 1
  • 2
  • 12
  • My comment and the answer to your previous question [here](http://stackoverflow.com/questions/27759867/omitting-and-finding-average-in-r) were difficult to extend from `mean` to `sum`? – talat Jan 06 '15 at 22:46

3 Answers3

2

You can subset the data.frame then sum the resulting column. Call that data.frame mydf

sum(mydf[ mydf$'STORE ID' == 111111 & mydf$YEAR == 2012, 3])
JeremyS
  • 3,497
  • 1
  • 17
  • 19
1

A solution using data.table package in case:

require(data.table)
setkey(dt, YEAR, STORE_ID)[.(2012, 111111), sum(AMOUNT)]

We first setkey() which sorts the data.table in increasing order based on the columns provided and marks those column as sorted so that we can use *fast binary search based joins, which we do in the next step with .(2012, 111111) which looks for 2012 in the first key column = YEAR and 111111 in the second = STORE_ID using binary search. And on those rows, we compute the sum of column AMOUNT.

You can also do it the typical base R way (which uses vector scan as opposed to binary search):

dt[YEAR == 2012 & STORE_ID == 1, sum(AMOUNT)]
Arun
  • 116,683
  • 26
  • 284
  • 387
Colonel Beauvel
  • 30,423
  • 11
  • 47
  • 87
1

Another solution using the dplyr and magrittr package:

library(dplyr)
library(magrittr)

y <- data_frame(YEAR = c(2011, 2011, 2012, 2012, 2012, 2012), 
                STORE_ID = c(1, 2, 1, 2, 1, 3), AMOUNT = c(11, 12, 4, 4, 45, 7))

y %>% filter(YEAR == 2012, STORE_ID == 1) %$% sum(AMOUNT)
alex23lemm
  • 5,475
  • 1
  • 21
  • 23