R subsetting by unique observation and prioritizing a value

Question

I am trying to subset my dataset so that (1) there is one observation per ID and (2) the retained row has "event" = 1 whenever that ID has event = 1 at any time, without dropping any IDs.

An example dataset looks like this:

 ID event
 A  1
 A  1
 A  0
 A  1
 B  0
 B  0
 B  0
 C  0
 C  1

Desired output

 ID event
 A  1
 B  0
 C  1

I imagine this would be done with dplyr, starting from df %>% group_by(ID), but I'm unsure how to prioritize a row with event = 1 without dropping the IDs where event is always 0. I do not want to lose any of the IDs.

Any help would be appreciated - thank you very much.

score 1 · Accepted Answer · answered Jan 29 '23 at 21:58

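We may use aggregate from base R, taking the max of event within each ID (1 if any row has event = 1, otherwise 0)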

aggregate(event ~ ID, df1, max)
   ID event
1  A     1
2  B     0
3  C     1

Or with dplyr

library(dplyr)
df1 %>%
   group_by(ID) %>%
   slice_max(event, n = 1, with_ties = FALSE) %>%
   ungroup()
# A tibble: 3 × 2
  ID    event
  <chr> <int>
1 A         1
2 B         0
3 C         1

data

df1 <- structure(list(ID = c("A", "A", "A", "A", "B", "B", "B", "C", 
"C"), event = c(1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L)), 
class = "data.frame", row.names = c(NA, 
-9L))
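
Since the question stresses that no ID should be lost, a quick sanity check on the result may be useful. The following is a minimal sketch on the example data; the name res is only illustrative and not part of the original answer.

```r
# Example data from the question, rebuilt with data.frame()
df1 <- data.frame(
  ID    = c("A", "A", "A", "A", "B", "B", "B", "C", "C"),
  event = c(1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L)
)

# One row per ID: event is 1 if any row for that ID has event == 1, else 0
res <- aggregate(event ~ ID, df1, max)

# Every ID from the input should still be present, exactly once
stopifnot(setequal(res$ID, unique(df1$ID)))
stopifnot(anyDuplicated(res$ID) == 0)
```

Note that the formula interface of aggregate drops rows with NA by default (na.action = na.omit), so an ID whose event values are all NA would fail this check. The example data has no missing values, so this does not arise here.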
