I loop over multiple data frames. In each one I subset the data on one variable and sum the entries selected by another variable. My problem is that the selection sometimes returns a zero-row tibble, and then `sum()` throws an error. Is there an elegant way to make `sum()` handle these cases and treat the empty tibble as zero? This works nicely with a data.frame but not with a tibble. I assume there is a solution for tibbles as well.

```r
# tibble
data <- tibble::tibble("A" = c(1, 1, 2), "B" = c(2, 2, 1), "C" = c(10, 20, 30))
good <- data[data$A == 1, ]
sum(good[good$B == "1", 'C'])

# data.frame
data <- data.frame("A" = c(1, 1, 2), "B" = c(2, 2, 1), "C" = c(10, 20, 30))
good <- data[data$A == 1, ]
sum(good[good$B == "1", 'C'])
```
Enoana
  • `aggregate(C ~ A + B, data=data, FUN=sum)` or `tapply(data$C, list(data$A, data$B), FUN=sum)`; please read https://stackoverflow.com/questions/3505701/grouping-functions-tapply-by-aggregate-and-the-apply-family – jogo Mar 19 '19 at 10:06
  • You can use the pipe operator: `data %>% summarise(total = sum(C[A == 1 & B == 1])) %>% pull(total)` – Ronak Shah Mar 19 '19 at 10:11
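
The grouping functions suggested in the comments treat missing A/B combinations differently, which matters for the "empty selection should count as zero" requirement. A minimal base-R sketch on the question's data follows; the names `present` and `totals` are only illustrative. `aggregate()` lists only the combinations that occur, while `tapply()` builds a full A-by-B table whose empty cells can be set to 0 through its `default` argument.

```r
data <- data.frame(A = c(1, 1, 2), B = c(2, 2, 1), C = c(10, 20, 30))

# aggregate() returns one row per combination that actually occurs,
# so A == 1 & B == 1 is simply absent from the result.
present <- aggregate(C ~ A + B, data = data, FUN = sum)
print(present)

# tapply() returns a full A x B table; default = 0 fills the cells
# that have no observations instead of leaving them as NA.
totals <- tapply(data$C, list(A = data$A, B = data$B), FUN = sum, default = 0)
print(totals["1", "1"])  # combination with no rows
print(totals["1", "2"])  # combination with two rows
```

Looking up a cell by name, as in `totals["1", "1"]`, then gives a number even when no row matches.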

1 Answer

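Using data.table, we can sum `C` for every combination of `A` and `B`:

```r
library(data.table)
setDT(data)[, .(C = sum(C)), .(A, B)]
```

If we only need the total for one specific combination, we can put the comparison in `i`:

```r
setDT(data)[A == 1 & B == 1, .(total = sum(C))]
```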
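
If the data should stay a tibble, one option is to avoid calling `sum()` on a tibble at all. With a single column selected, `[` on a data.frame drops to a plain vector, whereas `[` on a tibble keeps a one-column tibble. The sketch below is a rough illustration rather than part of the data.table approach above. It pulls `C` out as a vector with `[[` and then applies the row condition; the names `total_empty` and `total_match` are only illustrative.

```r
library(tibble)

data <- tibble(A = c(1, 1, 2), B = c(2, 2, 1), C = c(10, 20, 30))
good <- data[data$A == 1, ]

# `[[` extracts C as a plain numeric vector. An empty selection is then
# numeric(0), and sum(numeric(0)) is 0.
total_empty <- sum(good[["C"]][good$B == 1])  # no rows match
total_match <- sum(good[["C"]][good$B == 2])  # two rows match

print(total_empty)
print(total_match)
```

The same expression also works unchanged when `data` is a data.frame, so a loop over mixed inputs does not need a special case.
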
akrun