Background
I've got a dataframe:
d <- data.frame(ID = c("a","a","b","b", "c","c"),
event = c(0,1,0,0,1,1),
event_date = as.Date(c("2011-01-01","2012-08-21","2011-12-23","2011-12-31","2013-03-14","2015-07-12")),
stringsAsFactors=FALSE)
It's got some people (ID
), a binary indicator of an event
, and an event_date
for that event
.
The Problem
I want to subset this dataframe to return the rows for those ID's who have never had the event -- in other words, for ID's whose every event
cell, no matter how many they have, equals zero. No exceptions.
In other words, I'd like to get something that looks like this:
ID event event_date
b 0 2011-12-23
b 0 2011-12-31
What I've tried
I've got a couple of ways of trying to do this, one in dplyr
and the other in data.table
(I'm more familiar with dplyr
; the latter example is adapted from an answer to an older question I posted). But both get me the same result: the inclusion of ID=b
, which is good, but also the inclusion of ID=a
, who has one row of event
=1 -- which is not good.
Attempt 1:
d_test <- subset(d, ID %in% unique(ID[event == 0])) %>%
arrange(ID, event_date)
ID event event_date
1 a 0 2011-01-01
2 a 1 2012-08-21
3 b 0 2011-12-23
4 b 0 2011-12-31
Attempt 2:
setDT(d)
d_test <- d[ID %in% d[,event==0, by=ID][V1==TRUE, ID]]
ID event event_date
1: a 0 2011-01-01
2: a 1 2012-08-21
3: b 0 2011-12-23
4: b 0 2011-12-31
Bottom line: I can't express NEVER or NONE in my code.