I have a dataset that has over 10 million variables. It has 25 columns for diagnosis code. Each row represents one patient. Based on admission to the hospital Diagnosis code is generated and it can be placed anywhere between columns 1 to 25. I would like to create a new column for lets say disease A that has codes ranging from 480,481,482,483,484,V334,V555. If any of these codes are present in these 25 columns new column should return value of 1 else 0.
Original Table:
DX1 | DX2 | DX3 | DX4. | DX5. | DX6. | DX7. | DX8. | DX9. | DX10. | DX11. |
---|---|---|---|---|---|---|---|---|---|---|
481 | 482 | |||||||||
V334 | ||||||||||
484 | ||||||||||
485. | ||||||||||
V555 | 481 | |||||||||
F666 | ||||||||||
G765 | 481 |
I want new column with 0 or 1 variable even if one of the row has values: 480,481,482,483,484,V334,V555. For e.g.
DX1 | DX2 | DX3 | DX4. | DX5. | DX6. | DX7. | DX8. | DX9. | DX10. | DX11. | NewCol |
---|---|---|---|---|---|---|---|---|---|---|---|
481 | 482 | 1 | |||||||||
V334 | 0 | ||||||||||
484 | 1 | ||||||||||
485. | 1 | ||||||||||
V555 | 481 | 1 | |||||||||
F666 | 0 | ||||||||||
G765 | 481 | 1 |
I have tried using:
%>%mutate(NewCol = recode(DX1,
"486" = "1",
.default = "0"))
but this is only for column DX1 and variable 486. I want to apply for 25 consecutive columns ranging from DX1:DX25 and include all variables 480,481,482,483,484,V334,V555. Thank You!