I have a longitudinal dataset with three important variables: ID, Year, and Treatment.

I would like to keep every ID that is treated at some point in time and drop every ID that is never treated. How do I do this in R?

Example:

ID Year Treatment
0001 2000 0
0001 2001 0
0001 2002 0
0002 2000 0
0002 2001 0
0002 2002 1

I would like to keep all observations of ID 0002 (treated at some point in time) but drop all observations of ID 0001 (never treated). My real dataset is very large and has many more IDs, so I cannot do this manually.

Thanks in advance.

Elias W

1 Answer

Find the IDs that have Treatment equal to 1 in at least one year, then keep every row belonging to those IDs:

d[ d$ID %in% unique(d[ d$Treatment == 1,  "ID" ]), ]
#     ID Year Treatment
# 4 0002 2000         0
# 5 0002 2001         0
# 6 0002 2002         1
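
The same idea can also be written with a per-row flag, which may be easier to read and is less sensitive to missing values. This is only a sketch under assumptions: the data frame `d` is rebuilt here from the example in the question, ID is assumed to be stored as character (to keep the leading zeros), and the names `ever_treated` and `treated` are made up for illustration.

```r
# Example data from the question (assumed layout; ID stored as character)
d <- data.frame(
  ID        = c("0001", "0001", "0001", "0002", "0002", "0002"),
  Year      = c(2000, 2001, 2002, 2000, 2001, 2002),
  Treatment = c(0, 0, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# For each row, TRUE if that row's ID has Treatment == 1 in any year.
# na.rm = TRUE means a missing Treatment value counts as "not treated".
ever_treated <- ave(d$Treatment == 1, d$ID,
                    FUN = function(x) any(x, na.rm = TRUE))

# Keep all rows of the IDs that were treated at least once
treated <- d[ever_treated, ]
treated
```

Here `ave()` applies the function within each ID group and returns a vector as long as the data, so the result can be used directly as a row index.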
zx8754