
My goal is to pass a `subset` argument through a wrapper function to the `subset` argument of `model.frame()`. I can make it work when `subset` is a formal argument (first example below), but not when the named argument is passed through the ellipsis.

# WORKS
subset2 <- function(data, subset) {
  x <- substitute(subset)
  modified_data <- model.frame(~., data = data, subset = eval(x))
  return(modified_data)
}
subset2(mtcars, subset = cyl == 6)
#>                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> Merc 280C      17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6

# Doesn't work
subset2 <- function(data, ...) {
  dots <- list(...)
  subset <- dots$subset
  x <- substitute(subset)
  modified_data <- model.frame(~., data = data, subset = eval(x))
  return(modified_data)
}
subset2(mtcars, subset = cyl == 6)
#> Error in eval(expr, envir, enclos): object 'cyl' not found

Created on 2023-07-08 with reprex v2.0.2

Could someone shed light on why using the ellipsis breaks this example, and what can be done to fix it?

Related threads: 1, 2, 3. I also read http://adv-r.had.co.nz/Computing-on-the-language.html

rempsyc
  • The first `subset2` doesn't really "work" - at least not robustly. It breaks if `data` has a variable named `x`. If you want to learn how to do this correctly, then consider how `stats::lm` passes its arguments to `stats::model.frame`. – Mikael Jagan Jul 08 '23 at 21:51
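
A minimal sketch of the `stats::lm` idiom mentioned in this comment, assuming the wrapper only needs to forward `data` and `subset`. The name `subset2_call` and the toy data frame `d` are hypothetical. Instead of evaluating `subset` itself, the wrapper rewrites its own call into a `model.frame()` call and evaluates that in the caller's frame, so no local variable such as `x` can clash with a column name.

```r
subset2_call <- function(data, subset) {
  # Capture the call as written; no argument is evaluated here
  mf <- match.call(expand.dots = FALSE)
  # Keep the function slot plus the arguments model.frame() understands
  m <- match(c("data", "subset"), names(mf), 0L)
  mf <- mf[c(1L, m)]
  # Turn the call into stats::model.frame(data = ..., subset = ..., formula = ~ .)
  mf$formula <- quote(~ .)
  mf[[1L]] <- quote(stats::model.frame)
  # Evaluate in the caller's frame, as lm() does
  eval(mf, parent.frame())
}

# A data frame with a column named `x`, the case raised in the comment
d <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
subset2_call(d, subset = y > 4)
subset2_call(mtcars, subset = cyl == 6)
```

Because `model.frame()` receives the original expression (`y > 4`, `cyl == 6`), it handles the non-standard evaluation itself, which is the point of the idiom.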

1 Answer


You're probably overcomplicating things. The following should work for you:

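subset2 <- function(data, ...) {
  # Capture the (single) expression passed through `...` without evaluating it
  x <- substitute(...)
  # Evaluate it with the columns of `data` in scope and pass the result on
  model.frame(~., data = data, subset = eval(x, envir = data))
}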

Testing, we have:

subset2(mtcars, subset = cyl == 6)
#>                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Merc 280       19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> Merc 280C      17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> Ferrari Dino   19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6

Created on 2023-07-08 with reprex v2.0.2

Allan Cameron
  • Thank you, that works, and I've accepted your answer. That said, I failed to provide sufficient context in my question. I am modifying existing code in a pull request, and the authors used the `dots <- list(...)` strategy to easily refer to other arguments later; it is this part that breaks the subset argument. I had hoped to find a way to make both compatible, but I understand from this answer that this might not be possible. I might have to rewrite a lot more code than I wanted if I pass `...` along without first converting it to a list at the top. – rempsyc Jul 09 '23 at 14:46
  • @rempsyc the expression `dots <- list(...)` _evaluates_ the arguments in the dots, so it is no good for non-standard evaluation. You have a couple of options if you don't want to touch the line `dots <- list(...)`. You could `attach(data)` in the line before and `detach(data)` in the line after, or you could merge the data columns into the parent frame to make them accessible. If you are "allowed" to change this line, then you could use `match.call()` to get the unevaluated arguments in the dots. – Allan Cameron Jul 09 '23 at 16:03
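
The `match.call()` option from this comment could look roughly like the sketch below. It assumes the existing code only needs an evaluated `dots` list for arguments other than `subset`. The function name `subset2_dots`, the extra argument `digits`, and the returned list structure are all hypothetical, chosen only for illustration. The idea is to pull `subset` out of the dots as an unevaluated expression first, and only then evaluate whatever remains.

```r
subset2_dots <- function(data, ...) {
  caller <- parent.frame()
  # Unevaluated expressions for everything passed through `...`
  dots_expr <- as.list(match.call(expand.dots = FALSE)$...)
  # Set the subset expression aside before anything is evaluated
  subset_expr <- dots_expr[["subset"]]
  dots_expr[["subset"]] <- NULL
  # Evaluate the remaining arguments, much as `list(...)` would have done
  dots <- lapply(dots_expr, eval, envir = caller)
  # Build the model.frame() call with the raw expressions spliced in
  mf <- as.call(list(quote(stats::model.frame),
                     formula = quote(~ .),
                     data = substitute(data),
                     subset = subset_expr))
  # Return both pieces so later code can still refer to `dots`
  list(data = eval(mf, caller), dots = dots)
}

out <- subset2_dots(mtcars, subset = cyl == 6, digits = 1 + 1)
out$dots
nrow(out$data)
```

Later code can keep using `dots$digits` and similar lookups unchanged, while `subset` reaches `model.frame()` as the original expression.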