
I need to apply a transformation function to a series of columns based on a bespoke list of column names. Each of the column names has the same prefix and a numeric suffix, so I was hoping to find a simple way to transform each of the columns using those suffixes within the tidyverse.

Here is some toy data.

rm(list = ls())
set.seed(1)
df <- data.frame(q1 = sample(1:5, 10, replace = T),
                 q2 = sample(1:3, 10, replace = T),
                 q3 = sample(1:6, 10, replace = T),
                 q4 = sample(1:5, 10, replace = T),
                 q5 = sample(1:5, 10, replace = T))

#    q1 q2 q3 q4 q5
# 1   1  1  5  4  4
# 2   4  1  1  1  1
# 3   1  2  1  4  1
# 4   2  2  6  3  4
# 5   5  2  5  2  1
# 6   3  2  5  2  2
# 7   2  3  2  4  3
# 8   3  1  2  4  2
# 9   3  3  6  4  2
# 10  1  1  1  2  5

Now say q1, q4, and q5 all require the same recode. Using the numeric suffix of each variable, I can recode them with the following for-loop in base R, using the mapvalues function from plyr:

vec1 <- c(1, 4, 5)
df1 <- df
for (i in vec1) {
  df1[,paste0("q",i)] <- plyr::mapvalues(df1[,paste0("q",i)], from = 1:5, to = seq(100,0,-25))
}
df1

#     q1 q2 q3  q4  q5
# 1  100  1  5  25  25
# 2   25  1  1 100 100
# 3  100  2  1  25 100
# 4   75  2  6  50  25
# 5    0  2  5  75 100
# 6   50  2  5  75  75
# 7   75  3  2  25  50
# 8   50  1  2  25  75
# 9   50  3  6  25  75
# 10 100  1  1  75   0

I can also recode a single column using dplyr quite easily.

library(dplyr)  # provides %>% and mutate()
df %>% mutate(q1 = dplyr::recode(q1, `1` = 100, `2` = 75, `3` = 50, `4` = 25, `5` = 0))

#     q1 q2 q3 q4 q5
# 1  100  1  5  4  4
# 2   25  1  1  1  1
# 3  100  2  1  4  1
# 4   75  2  6  3  4
# 5    0  2  5  2  1
# 6   50  2  5  2  2
# 7   75  3  2  4  3
# 8   50  1  2  4  2
# 9   50  3  6  4  2
# 10 100  1  1  2  5

But when I try to do it using a for-loop in dplyr I run into all sorts of issues. Based on this post, I tried using rlang::syms() and the !!! operator:

df2 <- df
for (i in 1:length(vec1)) {
  var <- rlang::syms(paste0("q", vec1[i]))
  df2 <- df2 %>% mutate(!!!var = dplyr::recode(!!!var, `1` = 100, `2` = 75, `3` = 50, `4` = 25, `5` = 0))
}

But it generates the error

Error: unexpected '=' in:
"  var <- rlang::syms(paste0("q", vec1[i]))
  df2 <- df2 %>% mutate(!!!var ="

Any advice? Doesn't have to be dplyr. I have a feeling purrr might hold some answers but I know almost nothing about it.

llewmills

1 Answer


We can use mutate_at, or across in dplyr 1.0.0 and later, to apply the same function to several columns at once.

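library(dplyr)
df %>%
  # vec1 holds the suffixes, so build the column names from it;
  # passing vec1 directly would select columns 1, 4 and 5 by position
  mutate(across(all_of(paste0("q", vec1)),
                ~ recode(.x, `1` = 100, `2` = 75, `3` = 50, `4` = 25, `5` = 0)))
  #mutate_at(paste0("q", vec1), recode, `1` = 100, `2` = 75, `3` = 50, `4` = 25, `5` = 0)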

#    q1 q2 q3  q4  q5
#1  100  1  5  25  25
#2   25  1  1 100 100
#3  100  2  1  25 100
#4   75  2  6  50  25
#5    0  2  5  75 100
#6   50  2  5  75  75
#7   75  3  2  25  50
#8   50  1  2  25  75
#9   50  3  6  25  75
#10 100  1  1  75   0
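
The for-loop from the question can also be made to work. The parse error arises because R only accepts a plain name on the left of = inside a call, so !!!var = is rejected before dplyr ever sees it. rlang provides the := operator for this case, and a single column needs sym() with !! rather than syms() with !!!. Here is a minimal sketch under those assumptions; the data frame is typed in from the printed output in the question rather than re-sampled, so the values do not depend on the R version.

```r
library(dplyr)

# Toy data copied from the question's printed output
df <- data.frame(q1 = c(1, 4, 1, 2, 5, 3, 2, 3, 3, 1),
                 q2 = c(1, 1, 2, 2, 2, 2, 3, 1, 3, 1),
                 q3 = c(5, 1, 1, 6, 5, 5, 2, 2, 6, 1),
                 q4 = c(4, 1, 4, 3, 2, 2, 4, 4, 4, 2),
                 q5 = c(4, 1, 1, 4, 1, 2, 3, 2, 2, 5))
vec1 <- c(1, 4, 5)

df2 <- df
for (i in vec1) {
  # sym() returns one symbol; syms() would return a list of symbols
  var <- rlang::sym(paste0("q", i))
  # `:=` allows an unquoted (!!) name on the left-hand side
  df2 <- df2 %>%
    mutate(!!var := recode(!!var, `1` = 100, `2` = 75, `3` = 50, `4` = 25, `5` = 0))
}
df2
```

The loop is shown only to explain the error; the across version above does the same thing in one step.
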
Ronak Shah