I'm trying to learn purrr in the tidyverse and coming up short. I have a dataset that looks like:

DX1 DX2 DX3 DX4 DX5 DX6  ... DX26 
 2    2   2   2   4   7  ...  3
 4
 7    3   3   6   4
 3    4
 6

Where DX are various ICD9/10 codes, up to 26 total possible options. If there's no need to go past a given number of diagnoses, then the remaining DX variables are left blank.

I need to loop through all 26 DX variables and create a new variable where the value is 1 if there is any response of 4, and 0 if there is no response of 4. In other words, it should look like:

DX1 DX2 DX3 DX4 DX5 DX6  ... DX26 NewVar
 2    2   2   2   4   7  ...  3     1
 4                                  1
 7    3   3   6   4                 1
 3    4                             1
 6                                  0

Is there a simple way to have purrr do this? Thanks in advance for any advice!

camille
  • I don't know that `purrr` is necessary; you can probably just do this with `rowSums` – camille Dec 19 '19 at 14:09
  • With `purrr` you'd use `pmap` to process the df row by row. Something like `df$NewVar <- pmap_lgl(df, function(...) 4 %in% c(...) )`. Not as efficient as the rowSums solution. – asachet Dec 19 '19 at 14:15
  • Several more related posts [here](https://stackoverflow.com/q/17288222/5325862), [here](https://stackoverflow.com/q/46285484/5325862), [here](https://stackoverflow.com/q/45827337/5325862), [here](https://stackoverflow.com/q/55402065/5325862) – camille Dec 19 '19 at 14:20
  • Not using `purrr` but using `dplyr` would be `df %>% mutate(NewVar = ifelse(rowSums(. == 4) > 0, 1, 0))` – pgcudahy Dec 19 '19 at 21:43

1 Answer

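You can do this with rowSums(), assuming the blank cells are read in as NA. Note that rowSums() on its own returns how many 4s each row contains, so compare the count with 0 to get the 1/0 indicator you asked for:

df$NewVar <- as.integer(rowSums(df == 4, na.rm = TRUE) > 0)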
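
For a reproducible check, here is a minimal sketch on a made-up five-column version of the data from the question. The data frame, the `dx_cols` helper, and the `NewVar_purrr` column are assumptions for illustration only, and blanks are assumed to be NA. The purrr variant follows the `pmap` idea from the comments and only runs if purrr is installed.

```r
# Toy data mirroring the question (only 5 DX columns; blanks assumed to be NA)
df <- data.frame(
  DX1 = c(2, 4, 7, 3, 6),
  DX2 = c(2, NA, 3, 4, NA),
  DX3 = c(2, NA, 3, NA, NA),
  DX4 = c(2, NA, 6, NA, NA),
  DX5 = c(4, NA, 4, NA, NA)
)

# Restrict the comparison to the DX columns so other columns are ignored
dx_cols <- grep("^DX", names(df), value = TRUE)

# Base R: count the 4s per row, then turn "at least one" into 1/0
df$NewVar <- as.integer(rowSums(df[dx_cols] == 4, na.rm = TRUE) > 0)

# purrr alternative (row by row, slower on large data): is 4 among the row's values?
if (requireNamespace("purrr", quietly = TRUE)) {
  df$NewVar_purrr <- purrr::pmap_int(
    df[dx_cols],
    function(...) as.integer(4 %in% c(...))
  )
}

print(df)
```

Selecting the DX columns by name keeps the result stable if the data frame later gains other numeric columns that could also contain a 4.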
ThomasIsCoding