In an earlier post I asked how to remove every row for an ID if any row within that ID contains certain strings (e.g., A or D), using the following data frame in long (longitudinal) format. These are the R code examples I received in that post (from r2evans, akrun, and ThomasIsCoding, in that order):
- d %>% group_by(id) %>% filter(!any(dx %in% c("A", "D"))) %>% ungroup()
- filter(d, !id %in% id[dx %in% c("A", "D")])
- subset(d, !ave(dx %in% c("A", "D"), id, FUN = any))
While these all worked well, I realized that I have to remove more than 600 strings (e.g., A, D, E2, F112, G203, etc.), so I created a CSV file containing the list of these strings, without a column name. (1) Is creating such a list the right approach? (2) How should I modify the R code above to use the file with the list of strings? I reviewed other posts and Google search results, but I could not figure out how to apply them to my case. I would appreciate any suggestions!
Data frame:
id time dx
1 1 C
1 2 B
2 1 A
2 2 B
3 1 D
4 1 G203
4 2 E1
The results I want:
id time dx
1 1 C
1 2 B
UPDATE: Tarjae's answer below resolved the issue. The R code for the solution follows.
my_list <- read.csv("my_list.csv")
my_list (a single column named columnname):
columnname
A
D
E2
F112
G203
- d %>% group_by(id) %>% filter(!any(dx %in% my_list$columnname)) %>% ungroup()
- filter(d, !id %in% id[dx %in% my_list$columnname])
- subset(d, !ave(dx %in% my_list$columnname, id, FUN = any))
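If the CSV file really has no header row, as described in the question, read.csv() with its default header = TRUE would treat the first string as the column name instead of as data. The following is a minimal base R sketch of one way to handle that, not necessarily what the answer used: it reads the file with header = FALSE, in which case the column is automatically named V1. The temporary file and the names csv_path, codes, and result are made up for illustration; the temporary file stands in for the real my_list.csv.

```r
# Sample data from the question
d <- data.frame(
  id   = c(1, 1, 2, 2, 3, 4, 4),
  time = c(1, 2, 1, 2, 1, 1, 2),
  dx   = c("C", "B", "A", "B", "D", "G203", "E1")
)

# Hypothetical stand-in for the real file: one string per line, no header row
csv_path <- tempfile(fileext = ".csv")
writeLines(c("A", "D", "E2", "F112", "G203"), csv_path)

# header = FALSE keeps the first string as data; the column is auto-named V1.
# colClasses = "character" prevents values from being converted to other types.
my_list <- read.csv(csv_path, header = FALSE, colClasses = "character")
codes <- my_list$V1

# Drop every id that has at least one dx in the exclusion list
result <- subset(d, !ave(dx %in% codes, id, FUN = any))
print(result)
```

The same codes vector can be used in place of my_list$columnname in the dplyr versions above.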