I have a large dataset of product complaint reports submitted to the FDA for foods, dietary supplements, and cosmetics. Each row describes a person and the complaint they filed. After cleaning the data, I build a matrix of 0s and 1s from the symptoms column:
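# Split each comma-separated symptom string into a character vector
syms <- strsplit(dat$symptoms, ", ")
all_syms <- unique(unlist(syms))
# One row per complaint, one 0/1 column per distinct symptom
tm <- matrix(0, nrow = nrow(dat), ncol = length(all_syms))
colnames(tm) <- all_syms
# seq_along() is safe even when syms is empty, unlike 1:length(syms)
for (i in seq_along(syms)) {
  tm[i, syms[[i]]] <- 1
}
dat$symptoms <- NULL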
'dat' holds the complaint data, one report per row:
received | id | ... | product | outcome |
---|---|---|---|---|
9/30/2022 | 2022-CFS-014640 | ... | centrum silver men's 50+ | other outcome |
9/30/2022 | 2022-CFS-014637 | ... | liquid collagen shot | life threatening |
and 'tm' is the symptom matrix, with one row per row of 'dat' and one column per symptom:
diarrhoea | vomiting | cancer |
---|---|---|
0 | 1 | 0 |
1 | 0 | 0 |
... | ... | 1 |
I need to find the list of products that a person should avoid if they don't want to get cancer. I tried this:
# Find rows in tm matrix where the "cancer" symptom is present
cancer_rows <- which(tm[, "cancer"] == 1)
# Create a vector of product names associated with "cancer" symptoms
products_to_avoid <- unique(dat$product[cancer_rows])
but this doesn't work for me. Does anyone have an idea how I can write this properly?
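For reference, here is a minimal, self-contained sketch of what I'm trying to do, using made-up toy data (the product names and symptom strings below are hypothetical, not taken from the real dataset). It assumes that the symptom column names might not be exactly "cancer" (for example "Cancer" or "breast cancer"), so it matches them with grepl() rather than indexing by the exact name. I'm not sure this is the cause of my problem, it's just one possibility:

```r
# Hypothetical toy data standing in for the real FDA extract
dat <- data.frame(
  product  = c("product A", "product B", "product A", "product C"),
  symptoms = c("vomiting", "Cancer, diarrhoea", "diarrhoea", "breast cancer"),
  stringsAsFactors = FALSE
)

# Same 0/1 matrix construction as above
syms <- strsplit(dat$symptoms, ", ")
all_syms <- unique(unlist(syms))
tm <- matrix(0, nrow = nrow(dat), ncol = length(all_syms),
             dimnames = list(NULL, all_syms))
for (i in seq_along(syms)) {
  tm[i, syms[[i]]] <- 1
}

# Assumption: the real column names may differ from "cancer" in case or
# wording, so select every column whose name contains "cancer", ignoring case
cancer_cols <- grepl("cancer", colnames(tm), ignore.case = TRUE)

# TRUE for rows that have at least one matching symptom
# (drop = FALSE keeps a matrix even if only one column matches)
cancer_rows <- rowSums(tm[, cancer_cols, drop = FALSE]) > 0

# Distinct products reported together with a cancer-related symptom
products_to_avoid <- unique(dat$product[cancer_rows])
print(products_to_avoid)
```

Running colnames(tm)[cancer_cols] on the real data would show which symptom names were actually matched, which should make it easy to see whether an exact "cancer" column exists at all.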