I would like to know how to transform existing variables within a function using mutate() in the new dplyr 1.0.0, but I am having trouble with the left-hand side of the assignment inside mutate(). Here is my problem:

library(dplyr)
library(forcats)

# toy data
df <- tibble(d_1_a = factor(c(letters[1:5],NA,NA)),
             d_1_b = factor(c(NA,NA,letters[1:5])))

Now, using the new dplyr programming syntax .data[[foo]], I can create a function that mutates an existing column into a new column, in this case applying the fct_explicit_na function to a column.

muteFunct <- function(d, i) {
  d %>% mutate(newFact = forcats::fct_explicit_na(.data[[i]]))
}

muteFunct(df, "d_1_a")

# output
#   d_1_a d_1_b newFact  
#   <fct> <fct> <fct>    
# 1 a     NA    a        
# 2 b     NA    b        
# 3 c     a     c        
# 4 d     b     d        
# 5 e     c     e        
# 6 NA    d     (Missing)
# 7 NA    e     (Missing)

That worked fine; note the new column with the explicit NAs. But if I want to transform an existing column, the left-hand side also needs a name passed in through the function, and that causes problems (well, causes me problems anyway). I have tried this several ways; here is one attempt, using the !! operator:

muteFunct2 <- function(d, i) {
  d %>% mutate(!!.data[[i]] := forcats::fct_explicit_na(.data[[i]]))
}

muteFunct2(df, "d_1_a")

This generates the following error message:

Error: The LHS of `:=` must be a string or a symbol
Run `rlang::last_error()` to see where the error occurred. 

Can anyone help me out?

Note: I am aware there are better ways to achieve this specific outcome (explicit NAs) in dplyr without creating a bespoke function; this example is just for illustrative purposes. I am more interested in the general question of how to pass arguments to the left-hand side of the = sign in mutate().

llewmills

1 Answer

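On the left-hand side of :=, we can pass a string. Since i already holds the column name as a string, just unquote it with !!i :=. The attempt in the question fails because .data[[i]] is an expression rather than a string or a symbol, which is what the LHS of := requires.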

muteFunct2 <- function(d, i) {
  d %>%
    mutate(!!i := forcats::fct_explicit_na(.data[[i]]))
}

Testing:

muteFunct2(df, "d_1_a")
# A tibble: 7 x 2
  d_1_a     d_1_b
  <fct>     <fct>
1 a         <NA> 
2 b         <NA> 
3 c         a    
4 d         b    
5 e         c    
6 (Missing) d    
7 (Missing) e    
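
As a variation on the !!i := approach above, here is a minimal sketch of two related spellings of the left-hand side, assuming a version of rlang recent enough to support glue-style names (the one that ships alongside dplyr 1.0.0 should be). The helper names mute_glue and mute_embrace are made up for illustration. The first takes the column name as a string, the second takes a bare column name. Note that fct_explicit_na() may emit a deprecation warning under newer forcats releases.

```r
library(dplyr)

# Toy data from the question
df <- tibble(d_1_a = factor(c(letters[1:5], NA, NA)),
             d_1_b = factor(c(NA, NA, letters[1:5])))

# Hypothetical helper: `i` is a column name held in a string.
# "{i}" is interpolated glue-style to build the LHS name.
mute_glue <- function(d, i) {
  d %>% mutate("{i}" := forcats::fct_explicit_na(.data[[i]]))
}

# Hypothetical helper: `col` is a bare (unquoted) column name.
# "{{ col }}" names the result after the column the caller supplied.
mute_embrace <- function(d, col) {
  d %>% mutate("{{ col }}" := forcats::fct_explicit_na({{ col }}))
}

res_glue    <- mute_glue(df, "d_1_a")
res_embrace <- mute_embrace(df, d_1_a)
```

Both calls should overwrite d_1_a in place and leave d_1_b untouched. The glue form is also handy when the new column needs a derived name, for example "{i}_explicit".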

To update an existing column, we can also use across(). The advantage is that it can take more than one column:

muteFunct2 <- function(d, i) {
   d %>%
     mutate(across(all_of(i),  ~ forcats::fct_explicit_na(.)))
 }

Testing:

muteFunct2(df, "d_1_a")    # a single column
muteFunct2(df, names(df))  # several columns at once; output shown below
# A tibble: 7 x 2
  d_1_a     d_1_b    
  <fct>     <fct>    
1 a         (Missing)
2 b         (Missing)
3 c         a        
4 d         b        
5 e         c        
6 (Missing) d        
7 (Missing) e       
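
If the originals should be kept and the transformed versions added as new columns, across() also accepts a .names argument. The following is a minimal sketch under that assumption; the helper name mute_across_new and the "_explicit" suffix are made up for illustration, and the "{.col}" placeholder assumes a reasonably recent dplyr.

```r
library(dplyr)

# Toy data from the question
df <- tibble(d_1_a = factor(c(letters[1:5], NA, NA)),
             d_1_b = factor(c(NA, NA, letters[1:5])))

# Hypothetical helper: keep the original columns and add
# "<name>_explicit" columns holding the transformed factors.
mute_across_new <- function(d, cols) {
  d %>%
    mutate(across(all_of(cols),
                  ~ forcats::fct_explicit_na(.x),
                  .names = "{.col}_explicit"))
}

res <- mute_across_new(df, names(df))
names(res)
```

Here the LHS naming problem from the question does not arise at all, because across() generates the output names from the selected columns.
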
akrun