piping dplyr mutate with unknown variable name

Question

I'm trying to use mutate from dplyr with dynamic variable names. I've found a couple of posts on SO (here, here and here) that got me closer, but not to a workable solution. I don't think much is missing, but I need your help with it.

Here is a reproducible example very similar to my problem. I have tables with 2 fields, one of which is called either AD or some other name. This field has to be a factor, but it could arrive as character or integer, so my function needs to convert it to a factor.

library(dplyr)

t1 <- data.frame(f1 = 1:4, AD = 1:4)
t2 <- data.frame(f1 = 1:4, FC = 1:4)

ff <- function(tt){

  # find the variable name
  if(any(colnames(tt)=="AD")){
    vv <- quo(AD)
  } else {
    vv <- colnames(tt) %>% .[.!="f1"]
    vv <- enquo(vv)
  }

  # make the mutate
  tt %>% mutate(!!quo_name(vv) := as.factor(!!vv))      
}

With the help of the links cited earlier, I managed to make the function work for tables that have AD (using quo, !! and :=, which I didn't know about before).

ff(tt=t1) %>% str
'data.frame':   4 obs. of  2 variables:
 $ f1: int  1 2 3 4
 $ AD: Factor w/ 4 levels "1","2","3","4": 1 2 3 4

This works well, but when I send a table with an unknown variable name:

ff(tt=t2) %>% str
'data.frame':   4 obs. of  2 variables:
 $ f1: int  1 2 3 4
 $ FC: Factor w/ 1 level "FC": 1 1 1 1

My FC is now wrong: it is a factor with a single level, "FC".

I think the problem is in the way I set vv in the second branch, which gives me the wrong env value:

quo(AD)
<quosure>
  expr: ^AD
  env:  global


vv <- colnames(tt) %>% .[.!="f1"]
enquo(vv)
<quosure>
  expr: ^"FC"
  env:  empty

Any idea how to fix my problem? I'm open to a base R solution; however, it has to fit in a long piping procedure.

score 4 · Accepted Answer · answered Apr 20 '18 at 13:46

You don't need enquo there. That's for turning a value passed as a parameter into a quosure. Instead, you need to turn a string into a symbol. For that you can use as.name() or rlang::sym().

ff <- function(tt){

  # find the variable name
  if(any(colnames(tt)=="AD")){
    vv <- quo(AD)
  } else {
    vv <- colnames(tt) %>% .[.!="f1"]
    vv <- as.name(vv)
  }

  # make the mutate
  tt %>% mutate(!!quo_name(vv) := as.factor(!!vv))      
}
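As a rough sketch of the same idea using rlang::sym() instead of as.name(), the variant below looks up the column name as a string first and converts it once. The name ff_sym and the assumption that exactly one column besides f1 exists are illustrative only; they are not part of the original question.

```r
library(dplyr)
library(rlang)

t1 <- data.frame(f1 = 1:4, AD = 1:4)
t2 <- data.frame(f1 = 1:4, FC = 1:4)

# ff_sym is a hypothetical name; it assumes exactly one column besides f1
ff_sym <- function(tt) {
  # pick "AD" when present, otherwise whatever column is not f1
  col <- if ("AD" %in% colnames(tt)) "AD" else setdiff(colnames(tt), "f1")
  stopifnot(length(col) == 1)

  vv <- sym(col)  # turn the string into a symbol

  # a string is accepted on the left of := and the symbol is unquoted on the right
  tt %>% mutate(!!col := as.factor(!!vv))
}

str(ff_sym(t1))
str(ff_sym(t2))
```

Keeping the name as a string until the last step means both branches go through the same conversion, so quo_name() is not needed.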

MrFlick


Thanks, it also seems I could use `as.name` in the `AD` part: `vv <- as.name("AD")`. Why does `quo` exist then? Should I prefer one over the other? – Bastien Apr 20 '18 at 13:55

`quo()` creates a quosure which captures not only an expression/symbol but also an environment. In this particular case, you're not really interested in capturing an environment since you just want to evaluate a symbol in the context of the data in the mutate chain. If you had a more complicated expression than just a column name, you'd have to use something fancier than `as.name()`. – MrFlick Apr 20 '18 at 14:15
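To make the distinction in the comment above concrete, here is a small sketch comparing a quosure with a bare symbol. The data frame d and the string "AD * 2" are made up for illustration, and rlang::parse_expr() is shown only as one possible "fancier" option for strings that hold more than a column name.

```r
library(rlang)

d <- data.frame(AD = 1:4)   # illustrative data, not from the question

q <- quo(AD)        # quosure: expression plus the environment where it was created
s <- as.name("AD")  # bare symbol: expression only, no environment attached

is_quosure(q)                  # TRUE
is_quosure(s)                  # FALSE
identical(quo_get_expr(q), s)  # the quosure wraps the same symbol

# Evaluated against the data, both resolve to the AD column
eval_tidy(q, data = d)
eval_tidy(s, data = d)

# A string holding a full expression needs parsing rather than as.name()
e <- parse_expr("AD * 2")
eval_tidy(e, data = d)
```

The captured environment starts to matter once the expression refers to objects outside the data frame, which is not the case for a plain column name.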
