This question is about recoding polytomous variables in a large data set. Because the data set is large and many variables need recoding, I am looking for a flexible way to select all the relevant variables and recode them in one step. There are many resolved questions about recoding (e.g. Recoding multiple variables in R), but they do not fit the specifics of this question. Below is an example of the data:
df <- data.frame("id"  = c(1:5),
                 "ax1" = c(2, 1, 4, 3, 4),
                 "ax2" = c(7, 3, 6, 2, 2),
                 "bx1" = c(3, 5, 7, 1, 2),
                 "bx2" = c(1, 3, 1, 5, 2),
                 "cx1" = c(1, 1, 7, 1, 6),
                 "cx2" = c(3, 9, 5, 5, 4))
For instance, I would like to recode ax1, bx1 and cx1. On these variables, I want to recode 1, 2, 3, 4 as 0, 1, 1, 0, respectively, and recode any other value as NA. Using the 'dplyr' package, I tried
library(dplyr)

# Recode every column whose name ends in "x1"; values outside 1-4 become NA
df <- df %>%
  mutate_at(vars(ends_with("x1")),
            list(~ ifelse(. == 1, 0, ifelse(. == 2, 1, ifelse(. == 3, 1, ifelse(. == 4, 0, NA))))))
However, this does not produce the expected output. The expected output would look like this:
  id ax1 ax2 bx1 bx2 cx1 cx2
1  1   1   7   1   1   0   3
2  2   0   3  NA   3   0   9
3  3   0   6  NA   1  NA   5
4  4   1   2   0   5   0   5
5  5   0   2   1   2  NA   4
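To make the intended mapping concrete, here is a minimal base R sketch of the recoding described above, written with a named lookup vector instead of nested `ifelse()` calls. The names `recode_map` and `target_cols` are hypothetical and introduced only for illustration. This is one possible way to express the rule, not necessarily the best fit for a large data set.

```r
# Example data from the question
df <- data.frame(id  = 1:5,
                 ax1 = c(2, 1, 4, 3, 4),
                 ax2 = c(7, 3, 6, 2, 2),
                 bx1 = c(3, 5, 7, 1, 2),
                 bx2 = c(1, 3, 1, 5, 2),
                 cx1 = c(1, 1, 7, 1, 6),
                 cx2 = c(3, 9, 5, 5, 4))

# Hypothetical lookup: names are the original codes, values are the new codes
recode_map <- c("1" = 0, "2" = 1, "3" = 1, "4" = 0)

# Select the columns whose names end in "x1"
target_cols <- grep("x1$", names(df), value = TRUE)

# Index the lookup by the character form of each value;
# codes that are not in the lookup come back as NA
df[target_cols] <- lapply(df[target_cols], function(x) {
  unname(recode_map[as.character(x)])
})

print(df)
```

With a lookup like this, changing the recoding rule means editing one vector rather than rewriting a chain of conditions, and the column selection can be swapped for any other pattern.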