I have some variables that should be coded 0-3 or NA but have been coded 1-4 with "." for NA. I want to recode them using mutate() and case_when() in a for loop but after running the loop my data don't change. No errors appear.

Before the transformation the data look like this:

    tabyl(df$epds_1)

    # df$epds_1  n
    # .          6
    # 1        472
    # 2         93
    # 3         45
    # 4         23
    # <NA>      10

I tried to recode ten variables using the following loop:

    for (v in c("epds_1", "epds_2", "epds_3", "epds_4", "epds_5", "epds_6",
                "epds_7", "epds_8", "epds_9","epds_10")) {
      df <- df %>% mutate(`v` = case_when(`v` == "." ~ NA_real_, 
                                          `v` == "1" ~ 0, 
                                          `v` == "2" ~ 1, 
                                          `v` == "3" ~ 2, 
                                          `v` == "4" ~ 3))
      }

I am hoping for the following table (the NA count of 16 combines the "." row and the <NA> row of the original table):

    tabyl(df$epds_1)

    # df$epds_1  n
    # 0        472
    # 1         93
    # 2         45
    # 3         23
    # <NA>      16

It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. To mutate multiple columns, you'll want to use `across()` instead of a loop. – MrFlick May 15 '23 at 14:05

@NicChr has given you a working solution. As to the "why" your code does not work, you are a victim of non-standard evaluation [NSE](https://dplyr.tidyverse.org/articles/programming.html): the tidyverse (generally) expects you to provide the unquoted name of a data frame column, not a character variable containing the name of the column. – Limey May 15 '23 at 14:29

1 Answer

You can use across() to apply a function to several variables.

    vars <- c("epds_1", "epds_2", "epds_3", "epds_4", "epds_5", "epds_6",
              "epds_7", "epds_8", "epds_9", "epds_10")
    # Assign the result back to df; mutate() returns a new data frame
    df <- df %>% mutate(across(all_of(vars), ~ case_when(.x == "." ~ NA_real_,
                                                         .x == "1" ~ 0,
                                                         .x == "2" ~ 1,
                                                         .x == "3" ~ 2,
                                                         .x == "4" ~ 3)))
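
To check the across() approach on something small, here is a minimal sketch. The `toy` data frame and its values are made up for illustration, and it assumes the real columns are character vectors, as the "." entries suggest.

```r
library(dplyr)

# Hypothetical toy data: character columns coded "1"-"4", with "." and NA as missing
toy <- tibble(
  epds_1 = c("1", "2", ".", "4", NA),
  epds_2 = c("3", ".", "1", NA, "2")
)

recoded <- toy %>%
  mutate(across(all_of(c("epds_1", "epds_2")),
                ~ case_when(.x == "." ~ NA_real_,
                            .x == "1" ~ 0,
                            .x == "2" ~ 1,
                            .x == "3" ~ 2,
                            .x == "4" ~ 3)))

# Both "." and values that were already NA end up as NA,
# because case_when() returns NA when no condition matches
print(recoded)
```

Values that match none of the conditions also become NA, so it may be worth tabulating the columns first to confirm that only ".", "1" to "4" and NA occur.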

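As for why the loop isn't working: the backticks in `v` = ... do not substitute the value of the loop variable. mutate() creates (or overwrites) a column literally named v, and on the right-hand side v is just the string held by the loop variable (e.g. "epds_1"), never the column's contents, so the epds_* columns are left untouched. To get the loop to work, look the column up with .data[[v]] and set its name with !!v :=, something like this.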

    for (v in c("epds_1", "epds_2", "epds_3", "epds_4", "epds_5", "epds_6",
                "epds_7", "epds_8", "epds_9", "epds_10")) {
      df <- df %>% mutate(!!v := case_when(.data[[v]] == "." ~ NA_real_,
                                           .data[[v]] == "1" ~ 0,
                                           .data[[v]] == "2" ~ 1,
                                           .data[[v]] == "3" ~ 2,
                                           .data[[v]] == "4" ~ 3))
    }

I would advise against the loop, though, and use dplyr's across() as shown in my first example instead.
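
If you want to see what the original loop body actually does, a small diagnostic along these lines may help. The `toy` data frame is hypothetical, and `v` is set by hand to mimic a single iteration of the loop.

```r
library(dplyr)

# Hypothetical one-column example standing in for the real data
toy <- tibble(epds_1 = c("1", "2", ".", "4"))
v <- "epds_1"  # what the loop variable holds on the first iteration

# Same pattern as the question: `v` is not a column of toy, so on the
# right-hand side it resolves to the string "epds_1" from the environment
broken <- toy %>%
  mutate(`v` = case_when(`v` == "." ~ NA_real_,
                         `v` == "1" ~ 0))

names(broken)                         # includes a new column called "v"
identical(broken$epds_1, toy$epds_1)  # the original column is unchanged
```

So the data frame does change, just not in the columns being tabulated: running `names(df)` after the original loop should reveal a stray column named v.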

NicChr