
I want to write a custom function that can take bare and "string" inputs, and can handle both functions with and without the formula interface.

custom function example

# setup
set.seed(123)
library(tidyverse)

# custom function
foo <- function(data, x, y) {
  # function without formula
  print(table(data %>% dplyr::pull({{ x }}), data %>% dplyr::pull({{ y }})))

  # function with formula
  print(
    broom::tidy(stats::t.test(
      formula = rlang::new_formula({{ rlang::ensym(y) }}, {{ rlang::ensym(x) }}),
      data = data
    ))
  )
}

bare

works for both functions with and without formula interface

foo(mtcars, am, cyl)
#>    
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2

#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
#> 1     1.87      6.95      5.08      3.35 0.00246      25.9    0.724      3.02
#> # ... with 2 more variables: method <chr>, alternative <chr>

string

works for both functions with and without formula interface

foo(mtcars, "am", "cyl")
#>    
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2

#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
#> 1     1.87      6.95      5.08      3.35 0.00246      25.9    0.724      3.02
#> # ... with 2 more variables: method <chr>, alternative <chr>

as colnames

works only for functions without the formula interface

foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
#>    
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2

#> Error: Only strings can be converted to symbols
#> Backtrace:
#>     x
#>  1. \-global::foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
#>  2.   +-base::print(...)
#>  3.   +-broom::tidy(...)
#>  4.   +-stats::t.test(...)
#>  5.   +-rlang::new_formula(...)
#>  6.   \-rlang::ensym(y)

How can I modify the original function so that it will work with all the above-mentioned ways of entering the inputs and for both kinds of functions used?

Indrajeet Patil
  • 1
    you're on the verge of being too flexible: `x <- 'am'; y <- 'cyl'; foo(mtcars, x, y)` – rawr Feb 05 '20 at 21:03
  • 3
    @rawr or even more confusing: `am <- 'disp'; cyl <- 'gear'; foo(mtcars, am, cyl)`. What's supposed to happen there? Are the parameters supposed to be evaluated to their string values or not? – MrFlick Feb 05 '20 at 21:07

2 Answers


The nice philosophy of rlang is that you get to control when values are evaluated via the !! and {{}} operators. You seem to want a function that takes strings, symbols, and (possibly evaluated) expressions all in the same parameter. Handling symbols or literal strings is actually easy with ensym; the problem is also wanting to allow code like colnames(mtcars)[9], which has to be evaluated before it returns a string. This can be quite confusing. For example, what behavior do you expect when you run the following?

am <- 'disp'
cyl <- 'gear'
foo(mtcars, am, cyl)
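
To make the distinction concrete, here is a minimal sketch of how `ensym()` treats each kind of input. The wrapper `capture_name()` is a hypothetical name used only for illustration; it is not part of the question's code.

```r
library(rlang)

# Hypothetical wrapper: capture the argument as a symbol without evaluating it
capture_name <- function(x) ensym(x)

am <- "disp"

# A bare name is captured as the symbol `am`; the value "disp" is never looked up
from_symbol <- capture_name(am)

# A literal string is converted to the symbol `am`
from_string <- capture_name("am")

# A call is neither a symbol nor a string, so ensym() signals an error
from_call <- try(capture_name(colnames(mtcars)[9]), silent = TRUE)
```

With `ensym()` alone, the call above would therefore refer to the `am` and `cyl` columns rather than `disp` and `gear`.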

You could write a helper function if you want to assume that all calls should be evaluated but symbols and literals should not. Here's such a "cleaner" function:

clean_quo <- function(x) {
  if (rlang::quo_is_call(x)) {
    x <- rlang::eval_tidy(x)
  } else if (!rlang::quo_is_symbolic(x)) {
    x <- rlang::quo_get_expr(x)
  }
  if (is.character(x)) x <- rlang::sym(x)
  if (!rlang::is_quosure(x)) x <- rlang::new_quosure(x)
  x
}

and you could use that in your function with

foo <- function(data, x, y) {
  x <- clean_quo(rlang::enquo(x))
  y <- clean_quo(rlang::enquo(y))

  # function without formula
  print(table(data %>% dplyr::pull(!!x), data %>% dplyr::pull(!!y)))

  # function with formula
  print(
    broom::tidy(stats::t.test(
      formula = rlang::new_formula(rlang::quo_get_expr(y), rlang::quo_get_expr(x)),
      data = data
    ))
  )
}

Doing so will allow all these to return the same values

foo(mtcars, am, cyl)
foo(mtcars, "am", "cyl")
foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])

But you are probably just delaying other possible problems. I would not recommend over-interpreting user intentions with this kind of code. That's why it's better to let users explicitly unquote values themselves. Perhaps provide two different versions of the function: one for parameters that require evaluation and one for those that do not.
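
As a rough sketch of what explicit unquoting by the caller could look like: the function only accepts bare names or literal strings, and anything that needs evaluating is unquoted with `!!` at the call site. The helper `make_formula()` is a hypothetical name introduced for this example.

```r
library(rlang)

# Hypothetical helper: accepts only bare names or literal strings
make_formula <- function(x, y) {
  new_formula(ensym(y), ensym(x))
}

f_bare   <- make_formula(am, cyl)
f_string <- make_formula("am", "cyl")

# The caller decides what gets evaluated by unquoting with !!
f_unquoted <- make_formula(!!colnames(mtcars)[9], !!colnames(mtcars)[2])

# All three formulas read `cyl ~ am` and can be passed on, e.g. to stats::t.test()
res <- stats::t.test(f_unquoted, data = mtcars)
```

The function itself never has to guess: a bare `am` is always a column name, and a computed name has to be marked as such by the user.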

MrFlick
  • Thanks. I am supporting all the different ways to enter arguments because a lot of users of my functions expect them to work. This is because of their prior experience with the `tidyverse` functions, which can indeed accept symbols, strings, and expressions. – Indrajeet Patil Feb 05 '20 at 22:27

I have to agree with @MrFlick and others about inherent ambiguity when mixing standard and non-standard evaluation. (I also pointed this out in your similar question from a while ago.)

However, one can argue that dplyr::select() works with symbols, strings and expressions of the form colnames(.)[.]. If you absolutely must have the same interface, then you can leverage tidyselect to resolve your inputs:

library( rlang )
library( tidyselect )
library( magrittr )   ## provides %>%

ttest <- function(data, x, y) {
  ## Identify locations of x and y in data, get column names as symbols
  s <- eval_select( expr(c({{x}},{{y}})), data ) %>% names %>% syms

  ## Use the corresponding symbols to build the formula by hand
  broom::tidy(stats::t.test(
    formula = new_formula( s[[2]], s[[1]] ),
    data = data
  ))
}

## All three now work
ttest( mtcars, am, cyl )
ttest( mtcars, "am", "cyl" )
ttest( mtcars, colnames(mtcars)[9], colnames(mtcars)[2] )
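
To see what the selection step resolves to on its own, here is a small sketch that returns only the column names. The helper `resolve_names()` is a hypothetical name for illustration; it uses the same `eval_select()` call as above.

```r
library(rlang)
library(tidyselect)

# Hypothetical helper: return the column names that x and y resolve to
resolve_names <- function(data, x, y) {
  names(eval_select(expr(c({{ x }}, {{ y }})), data))
}

n_bare   <- resolve_names(mtcars, am, cyl)
n_string <- resolve_names(mtcars, "am", "cyl")
n_call   <- resolve_names(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])

# A global variable named like a column does not change the result:
# tidyselect looks for `am` among the columns of `data` first
am <- "disp"
n_shadowed <- resolve_names(mtcars, am, cyl)
```

All four calls resolve to the `am` and `cyl` columns, so the ambiguity raised in the comments is settled in favor of the column names.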
Artem Sokolov
  • 1
    Yes, thanks for noticing that. This was very much inspired by `dplyr::select`! I expect users who have experience with the tidyverse to also expect my functions to work the same way! And it would be nice if I could support that. – Indrajeet Patil Feb 05 '20 at 22:29