This question is about recoding polytomous variables in a large data set. Because the data set is large and many variables need recoding, I am looking for a flexible way to select all the relevant variables and recode them in one step. There are many resolved questions about recoding (e.g. Recoding multiple variables in R), but they do not cover the specifics of this one. Below is an example of the data:

df<-data.frame("id"=c(1:5),
           "ax1"=c(2,1,4,3,4),
           "ax2"=c(7,3,6,2,2),
           "bx1"=c(3,5,7,1,2),
           "bx2"=c(1,3,1,5,2),
           "cx1"=c(1,1,7,1,6),
           "cx2"=c(3,9,5,5,4)) 

For instance, I would like to recode ax1, bx1 and cx1. On these variables, I want to recode 1, 2, 3, 4 as 0, 1, 1, 0, respectively, and recode any other value as NA. Using the 'dplyr' package I tried:

df <- df %>%  
 mutate_at( vars(ends_with("x1")),
         list(~ ifelse( . == 1, 0, ifelse(.== 2, 1, ifelse(.==3, 1, ifelse(.==4, 0,NA))))))

However, this does not produce the expected output, which would look like this:

   id ax1 ax2 bx1 bx2 cx1 cx2
1  1   1   7   1   1   0   3
2  2   0   3  NA   3   0   9
3  3   0   6  NA   1  NA   5
4  4   1   2   0   5   0   5
5  5   0   2   1   2  NA   4
Ben Bolker
T Richard
    ?? I get the correct result when I run your code (as long as I add a missing close-parenthesis at the end) – Ben Bolker Nov 04 '19 at 01:43

2 Answers

In dplyr, there is a recode function intended for exactly this kind of value-by-value replacement:

library(dplyr)
df %>%  
   mutate_at(vars(ends_with("x1")),
        ~recode(., `1` = 0, `2` = 1, `3` = 1, `4` = 0, .default = NA_real_))

#  id ax1 ax2 bx1 bx2 cx1 cx2
#1  1   1   7   1   1   0   3
#2  2   0   3  NA   3   0   9
#3  3   0   6  NA   1  NA   5
#4  4   1   2   0   5   0   5
#5  5   0   2   1   2  NA   4
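
On dplyr 1.0.0 or later, where mutate_at() is superseded, the same recode can probably be written with across(). The following is only a sketch under that assumption: it rebuilds the df from the question so it runs on its own, and the name `recoded` is illustrative and not from the thread.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(id  = 1:5,
                 ax1 = c(2, 1, 4, 3, 4),
                 ax2 = c(7, 3, 6, 2, 2),
                 bx1 = c(3, 5, 7, 1, 2),
                 bx2 = c(1, 3, 1, 5, 2),
                 cx1 = c(1, 1, 7, 1, 6),
                 cx2 = c(3, 9, 5, 5, 4))

# Recode every column whose name ends in "x1";
# any value other than 1, 2, 3 or 4 falls through to .default (NA)
recoded <- df %>%
  mutate(across(ends_with("x1"),
                ~ recode(.x, `1` = 0, `2` = 1, `3` = 1, `4` = 0,
                         .default = NA_real_)))
```

The columns ending in "x2" are not selected, so they should pass through unchanged.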
Ronak Shah

Another possibility:

df %>% mutate_at(vars(ends_with("x1")), 
    ~case_when(. %in% c(1,4) ~ 0,
               . %in% c(2,3) ~ 1))

(Not sure why you need the list() in there?)
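
To see why no explicit "otherwise NA" branch is needed in the case_when() call above, it may help to run the same conditions on a single vector. This is a small sketch: `recode_x1` is a hypothetical helper name, and `x` is the bx1 column from the question.

```r
library(dplyr)

# Hypothetical helper wrapping the two conditions from the answer above
recode_x1 <- function(x) {
  case_when(x %in% c(1, 4) ~ 0,
            x %in% c(2, 3) ~ 1)
}

x <- c(3, 5, 7, 1, 2)   # bx1 from the question
result <- recode_x1(x)
# 5 and 7 match neither condition, so case_when() leaves them as NA
result
# [1]  1 NA NA  0  1
```

Values that match no condition come back as NA, which is what the question asks for.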

Ben Bolker