Passing function arguments to dplyr select

Question

To select a couple of columns from a data frame, I can do:

require(dplyr)
require(magrittr)

df <- data.frame(col1=c(1, 2, 3), col2=letters[1:3], col3=LETTERS[4:6])

df %>%
  select(col1, col2)

I want to write a function similar to

f <- function(data, firstCol, secondCol){
   data %>%
    select(substitute(firstCol), substitute(secondCol))
}

But running `f(df, col1, col2)` gives me the error:

Error in select_vars(names(.data), ..., env = parent.frame()) : 
  (list) object cannot be coerced to type 'double'
Called from: (function () 
{
    .rs.breakOnError(TRUE)
})()

EDIT -- slightly less trivial example:

Suppose I wanted to do

mtcars %>%
  select(cyl, hp) %>%
  unique %>%
  group_by(cyl) %>%
  summarise(avgHP = mean(hp))

but with different datasets and different variable names. I could copy the code and replace mtcars, cyl, and hp each time, but I'd rather wrap it all up in a function.

Perhaps [**this post**](http://stackoverflow.com/questions/22005419/dplyr-without-hard-coding-the-variable-names) (with an answer by @hadley) is relevant? — Henrik, Apr 07 '14 at 18:27
Just curious. Did anybody solve the slightly less trivial example in the end? — tim, May 06 '15 at 17:01
@tim see http://cran.r-project.org/web/packages/dplyr/vignettes/nse.html quite a bit of things have changed since last year — kevinykuo, May 06 '15 at 17:03
@organicagave Thanks for pointing me in the right direction. — tim, May 06 '15 at 18:13

score 6 · Accepted Answer · answered Apr 10 '14 at 23:53

It's pretty simple in this case, since you can just pass the column names through with `...`:

f <- function(data, ...) {
  data %>% select(...)
}

f(df, col1, col2)

#>   col1 col2
#> 1    1    a
#> 2    2    b
#> 3    3    c

In the more general case, you have two options:

1. Wait until https://github.com/hadley/dplyr/issues/352 is closed.
2. Construct the complete expression using substitute() and then eval() it.
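
As a rough sketch of the second option, applied to the mtcars example from the question: the function below builds the whole pipeline as an unevaluated call with substitute(), swaps the caller's column names in for placeholders, and then evaluates it. The names `avg_by_subst`, `g`, `v`, and `avg` are hypothetical and chosen only for illustration; this is one way to do it, not necessarily what the answer had in mind.

```r
library(dplyr)

# Hypothetical helper: average `value_col` per `group_col` over the
# distinct (group, value) pairs, using base-R substitute() + eval().
avg_by_subst <- function(data, group_col, value_col) {
  # Capture the unevaluated column names supplied by the caller.
  cols <- list(g = substitute(group_col), v = substitute(value_col))

  # Build the full pipeline as a call, replacing the placeholders g and v.
  # `data` is not in `cols`, so it stays a plain symbol.
  call <- substitute(
    data %>%
      select(g, v) %>%
      unique() %>%
      group_by(g) %>%
      summarise(avg = mean(v)),
    cols
  )

  # eval() runs the call in this function's frame, where `data` is defined.
  eval(call)
}

result <- avg_by_subst(mtcars, cyl, hp)
```

The placeholders work because substitute() walks the entire expression tree, so every occurrence of `g` and `v` is replaced before dplyr ever sees the call.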

hadley

102,019
32
183
245

In fact, in this case you can even use `f <- select`. – G. Grothendieck, Apr 22 '14 at 22:59

score 4 · Answer 2 · answered Jul 05 '20 at 15:05

Since rlang 0.4.0, the curly-curly operator `{{ }}` is a better solution:

f <- function(data, firstCol, secondCol){
   data %>%
    select({{ firstCol }}, {{ secondCol }})
}

df <- data.frame(col1=c(1, 2, 3), col2=letters[1:3], col3=LETTERS[4:6])

df %>% f(col1, col2)

#   col1 col2
# 1    1    a
# 2    2    b
# 3    3    c
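
The same operator should also cover the slightly less trivial example from the question. The following is a minimal sketch, assuming a dplyr version that supports `{{ }}` (1.0 or later is a safe assumption); the function name `avg_by` and the output column `avg` are hypothetical, and `distinct()` is used here in place of `unique`.

```r
library(dplyr)

# Hypothetical helper: average `value_col` per `group_col` over the
# distinct (group, value) pairs. {{ }} forwards the caller's bare
# column names into each dplyr verb.
avg_by <- function(data, group_col, value_col) {
  data %>%
    select({{ group_col }}, {{ value_col }}) %>%
    distinct() %>%
    group_by({{ group_col }}) %>%
    summarise(avg = mean({{ value_col }}))
}

result <- avg_by(mtcars, cyl, hp)
```

Because the dataset and both columns are ordinary arguments, the same function can be reused as, for example, `avg_by(iris, Species, Sepal.Length)`.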
