I am trying to recode several variables, each with a different recoding scheme. The schemes are saved in a list in which each element is a named vector of the form old = new, and each element is the recoding scheme for the data frame variable of the same name.

I am using the mutate_at function together with recode.

I think the problem is that I cannot extract the variable name in order to pick the correct recoding scheme from the list.

I tried deparse(substitute(.)), as suggested here, but that didn't help either.

I also saw here that tidy evaluation can be used to extract the name of the column being passed, but again I failed to implement it (that example also uses the deprecated `funs`).

Lastly, I am hoping that this is the correct approach to recoding the variables (i.e. using this recode list inside mutate). If there is a totally different way to approach this kind of multiple recoding, please let me know.

library(dplyr)
# dplyr version 0.8.5

df <- 
  tibble(
    var1 = c("A", "A", "B", "C"),
    var2 = c("X", "Y", "Z", "Z")
  )

recode_list <- 
  list(

    var1 = c(A = 1, B = 2, C = 3),
    var2 = c(X = 0, Y = -1, Z = 1)
  )

recode_list
#> $var1
#> A B C 
#> 1 2 3 
#> 
#> $var2
#>  X  Y  Z 
#>  0 -1  1

I am using the dplyr::recode function.


# recoding works fine when done one variable at a time
df %>% 
  mutate(
    var1 = recode(var1, !!!recode_list[["var1"]]),
    var2 = recode(var2, !!!recode_list[["var2"]])
  )
#> # A tibble: 4 x 2
#>    var1  var2
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1    -1
#> 3     2     1
#> 4     3     1

When I try to apply a function that does this for all variables, it fails:

# this does not work.
df %>%
  mutate_at(vars(var1, var2), ~{

    var_name <- rlang::quo_name(quo(.))

    recode(., !!!recode_list[[var_name]])
  }
  )
#> Error in expr_interp(f): object 'var_name' not found

I also tried rlang::as_name and rlang::as_label, but I still cannot capture the name of the variable as a string in order to subset recode_list.


df %>%
  mutate_at(vars(var1, var2), ~ {
    var_name <- rlang::as_name(quo(.))
    print(var_name)
    #recode(., !!!recode_list[["var2"]])
  }
  )
#> [1] "."
#> [1] "."
#> # A tibble: 4 x 2
#>   var1  var2 
#>   <chr> <chr>
#> 1 .     .    
#> 2 .     .    
#> 3 .     .    
#> 4 .     .


Created on 2020-04-30 by the reprex package (v0.3.0)
Lefkios Paikousis

1 Answer

Does this work for you?

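library(dplyr)
library(rlang)
library(magrittr)  # provides the %<>% assignment pipe used below
df %>% 
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # local copy: keep only the scheme named after the column passed as x
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })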
## A tibble: 4 x 2
#   var1  var2
#  <dbl> <dbl>
#1     1     0
#2     1    -1
#3     2     1
#4     3     1

I suspect this works, while placing the subsetted recode_list directly inside recode does not, because enquo delays evaluation of x until the assignment with %<>%. The !!! operator can then force evaluation once the subset has already been computed.
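
To make the name-capturing part concrete, here is a minimal sketch that is independent of dplyr. It compares capturing the expression supplied for an argument (enquo, or substitute in base R) with quoting a literal symbol (quo). The three helper names are hypothetical and exist only for this illustration.

```r
library(rlang)

# Hypothetical helpers, named only for this illustration.
label_with_enquo      <- function(x) as_label(enquo(x))      # captures what the caller supplied
label_with_substitute <- function(x) deparse(substitute(x))  # base R equivalent
label_with_quo        <- function(x) as_label(quo(x))        # quotes the literal symbol `x`

# var1 is never evaluated here, so it does not need to exist.
from_enquo      <- label_with_enquo(var1)       # "var1"
from_substitute <- label_with_substitute(var1)  # "var1"
from_quo        <- label_with_quo(var1)         # "x"

print(c(enquo = from_enquo, substitute = from_substitute, quo = from_quo))
```

This is consistent with the "." printed in the question: inside a `~` lambda the argument is itself called `.`, so quo(.) can only ever return that symbol, never the column name.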

Edit

Your approach with rlang also works with some modifications:

library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
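
As for a totally different approach: one option that avoids capturing the column name altogether is to loop over the names of recode_list and use each named vector as a lookup table. This is only a sketch in base R, assuming every scheme name matches a column name; recode_by_lookup is a hypothetical helper, not a dplyr function.

```r
# Same data as in the question, built with base R only.
df <- data.frame(
  var1 = c("A", "A", "B", "C"),
  var2 = c("X", "Y", "Z", "Z"),
  stringsAsFactors = FALSE
)

recode_list <- list(
  var1 = c(A = 1, B = 2, C = 3),
  var2 = c(X = 0, Y = -1, Z = 1)
)

# Hypothetical helper: recode every column that has a scheme in `schemes`.
recode_by_lookup <- function(data, schemes) {
  for (nm in names(schemes)) {
    # Index the named vector by the old values, then drop the old labels.
    data[[nm]] <- unname(schemes[[nm]][data[[nm]]])
  }
  data
}

recoded <- recode_by_lookup(df, recode_list)
print(recoded)
```

Because this is plain indexing by name, any value that has no entry in its scheme comes back as NA, so it is worth checking for unexpected NAs afterwards.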
Ian Campbell
  • Thanks @Ian ! It works! much appreciated! (I still don't understand why we should use `enquo` (or `substitute` in base R) instead of `quo`. Also why the `function(x)` works while the `~` does not) – Lefkios Paikousis Apr 30 '20 at 13:30
  • Also, `quo_name` is deprecated and you can replace it with `as_name` or `as_label`. Thanks again! – Lefkios Paikousis Apr 30 '20 at 13:32
  • Overall, I think the thing that stopped you was the non-standard evaluation of `.funs = list(~{})`, which I will admit, I don't understand fully. When it doesn't work the way I expect, I just go back to the classic `function(x){}` form. – Ian Campbell Apr 30 '20 at 13:36