
I have data with a grouping variable 'group' and a logical variable 'logic'.

library(data.table)
library(dplyr)

dt <- data.table(
    logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
    group = c("A" , "A",  "A"  , "B" , "B" , "B")
)

I would like to keep only the groups in which all values in the 'logic' column are TRUE.

dplyr works as expected and keeps only group B:

dt %>% 
  group_by(group) %>% 
  filter(all(logic))
# Source: local data table [3 x 2]
# Groups: group

#   logic group
# 1  TRUE     B
# 2  TRUE     B
# 3  TRUE     B

However, my attempts with data.table have failed, returning either the whole table or nothing.

dt[all(logic), group, by = group]
# Empty data.table (0 rows) of 2 cols: group,group

dt[all(.SD$logic), group,by = group]
#    group group
# 1:     A     A
# 2:     B     B
Henrik
Cron Merdek
  • The dplyr solution does not correspond to the data.table solution. In dplyr you first group and then filter; in data.table you first filter and then group. – jangorecki Dec 21 '15 at 12:12
  • @jangorecki, good point. How then is `.SD` defined in `dt[all(.SD$logic), group, by = group]`? – Cron Merdek Dec 21 '15 at 12:26
  • I don't understand your short question in last comment. You may want to make it a new SO question. – jangorecki Dec 21 '15 at 13:43
  • @jangorecki, I just have concerns regarding the logic you mentioned. `.SD` is defined only when `by` has already been evaluated. *.SD is a data.table containing the Subset of x's Data for each group, excluding any columns used in by (or keyby).* – Cron Merdek Dec 21 '15 at 14:00
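
The evaluation order discussed in these comments can be checked directly. The following is a minimal sketch using the question's data; the variable names (`whole_table_check`, `filtered_first`, `per_group_check`) are only illustrative. It shows that a condition placed in `i` is evaluated once against the whole table, whereas the same condition placed in `j` is evaluated once per group.

```r
library(data.table)

dt <- data.table(
  logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
  group = c("A", "A", "A", "B", "B", "B")
)

# In `i`, 'logic' is the full column, so all() sees the FALSE in row 3
whole_table_check <- all(dt$logic)

# The single FALSE in `i` selects zero rows before `by` is applied
filtered_first <- dt[all(logic)]

# In `j` with `by`, 'logic' is the per-group subset of the column
per_group_check <- dt[, .(all_true = all(logic)), by = group]
```

Here `per_group_check` has one row per group, which is the per-group test the question is after.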

2 Answers


You could use `[` on `.SD`, as in

dt[, .SD[all(logic)], by = group]
#   group logic
#1:     B  TRUE
#2:     B  TRUE
#3:     B  TRUE
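
A possible variant of the same idea, offered as a sketch rather than as part of the answer above: instead of subsetting `.SD` inside every group, collect the row numbers of the qualifying groups with the special symbol `.I` and subset the table once. `keep_rows` and `result` are illustrative names; `V1` is the default name data.table gives to an unnamed `j` result.

```r
library(data.table)

dt <- data.table(
  logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
  group = c("A", "A", "A", "B", "B", "B")
)

# .I holds the row numbers of the current group within dt;
# indexing it with a single TRUE keeps them all, with FALSE keeps none
keep_rows <- dt[, .I[all(logic)], by = group]$V1

# Subset the original table once, using the collected row numbers
result <- dt[keep_rows]
```

Unlike the `.SD` version, this keeps the original column order (`logic`, then `group`).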
talat

We need to use `if`

dt[, if(all(logic)) .SD, by = group]
#    group logic
#1:     B  TRUE
#2:     B  TRUE
#3:     B  TRUE
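
This works because an `if` without an `else` returns `NULL` when the condition is FALSE, and a group whose `j` is `NULL` contributes no rows. One caveat, sketched below on hypothetical data (`dt_na` is not from the question): if a group contains `NA` and no `FALSE`, `all(logic)` is `NA` and a bare `if` stops with an error. Wrapping the condition in `isTRUE()` is one way to treat such groups as failing the test.

```r
library(data.table)

# Hypothetical data: group A contains a missing value
dt_na <- data.table(
  logic = c(TRUE, NA, TRUE, TRUE),
  group = c("A", "A", "B", "B")
)

# all(c(TRUE, NA)) is NA, so `if (all(logic))` would raise an error
# for group A; isTRUE() maps NA to FALSE and the group is dropped
safe <- dt_na[, if (isTRUE(all(logic))) .SD, by = group]
```

Whether a group with missing values should be dropped or kept depends on the use case; `all(logic, na.rm = TRUE)` would keep it instead.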
akrun