
I'm writing a function that pastes together a formula and then returns a feols result. However, the formula ends up with a c at the beginning. How can I fix this?

library(dplyr)
library(fixest)

data(base_did)
base_did = base_did %>% mutate(D = 5*rnorm(1080),
                               x2 = 10*rnorm(1080),
                               rand_wei = abs(rnorm(1080)))

f <- function(data, arg=NULL){
  
  arg = enexpr(arg)
  
  if (length(arg) == 0) {
    formula = "D ~ 1"
  } 
  else {
    formula = paste(arg, collapse = " + ")
    formula = paste("D ~ ", formula, sep = "")
  }
  
  formula = paste(formula, " | id + period", sep = "")
  denom.lm <- feols(as.formula(formula), data = data, 
                    weights = abs(data$rand_wei))
  
  return(denom.lm)
}

f(base_did, arg = c(x1,x2))

#Error in feols(as.formula(formula), data = data, weights = abs(data$rand_wei)) : 
#  Evaluation of the right-hand-side of the formula raises an error: 
#  In NULL: Evaluation of .Primitive("c") returns an object of length 1
#while the data set has 1080 rows.

If I return(formula) at the end instead, I get [1] "D ~ c + x1 + x2 | id + period".

But I need only D ~ x1 + x2 | id + period.

2 Answers

2

One option to make your function work is to pass the variables via ... so that c is not needed, which prevents c from being added to your formula. To make this work you also have to switch to enexprs inside your function.

Note: I slightly adjusted your function for the reprex to return just the formula.

library(dplyr, warn.conflicts = FALSE)
library(fixest)

data(base_did)

base_did = base_did %>% mutate(D = 5*rnorm(1080),
                               x2 = 10*rnorm(1080),
                               rand_wei = abs(rnorm(1080)))

f <- function(data, ...){
  arg = enexprs(...)
  
  if (length(arg) == 0) {
    formula = "D ~ 1"
  } 
  else {
    formula = paste(arg, collapse = " + ")
    formula = paste("D ~ ", formula, sep = "")
  }
  
  formula = paste(formula, " | id + period", sep = "")
  
  as.formula(formula)
}


f(base_did, x1, x2)
#> D ~ x1 + x2 | id + period
#> <environment: 0x7fe8f3567618>

f(base_did)
#> D ~ 1 | id + period
#> <environment: 0x7fe8f366f848>

UPDATE: There is probably a better approach, but after some research a possible option would be the following.

Note: When you pass multiple arguments via c, enexpr returns a call object, which behaves like a list whose first element is the function name, i.e. c. That's why c gets added to your formula. Dropping that first element and converting the rest to strings gives the variable names only:

f <- function(data, arg = NULL) {
  arg <- enexpr(arg)
  
  if (length(arg) == 0) {
    formula <- "D ~ 1"
  } else {
    if (length(arg) > 1) arg <- vapply(as.list(arg[-1]), rlang::as_string, FUN.VALUE = character(1))
    
    formula <- paste(arg, collapse = " + ")
    formula <- paste("D ~ ", formula, sep = "")
  }

  formula <- paste(formula, " | id + period", sep = "")

  as.formula(formula)
}


f(base_did, c(x1, x2))
#> D ~ x1 + x2 | id + period
#> <environment: 0x7fa763431388>

f(base_did, x1)
#> D ~ x1 | id + period
#> <environment: 0x7fa763538c40>

f(base_did)
#> D ~ 1 | id + period
#> <environment: 0x7fa765e22028>
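
The call-object behavior described in the note can be inspected directly. The following is a minimal sketch in base R only: it uses quote() as a stand-in for what enexpr() captures, and as.character in place of rlang::as_string, so it is an illustration rather than the exact code from the function above.

```r
# Base R stand-in: quote() captures the expression, much like enexpr() captures an argument
arg <- quote(c(x1, x2))

class(arg)    # "call"
length(arg)   # 3: the function name plus its two arguments
as.list(arg)  # the symbols c, x1 and x2

# paste() converts every element to a string, so the function name is included
with_c <- paste(arg, collapse = " + ")
with_c        # "c + x1 + x2"

# Dropping the first element keeps only the arguments
vars <- vapply(as.list(arg[-1]), as.character, FUN.VALUE = character(1))
without_c <- paste(vars, collapse = " + ")
without_c     # "x1 + x2"
```

This also shows why the length(arg) > 1 check is needed: a single variable such as x1 is captured as a symbol, not a call, so there is no leading function name to drop.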
stefan
  • Ty @stefan for your answer. Is it possible to do it without using `...`? I would like to create an argument `arg1` and then `arg2` (e.g. `arg1`: exogenous variables; `arg2`: control variables). – user20168262 Oct 06 '22 at 04:18
  • The function that I need should be `f(data, arg1 = c(x1, x2), arg2 = c(crt1, crt2))` – user20168262 Oct 06 '22 at 04:34
  • See my update. I came up with a possible and working solution but am sure that there is a "cleaner" solution. – stefan Oct 06 '22 at 07:00
  • Ty so much!!! Can you please explain how the line `vapply(...)` works, and the arguments inside `vapply`? Is there any difference from my [last post](https://stackoverflow.com/a/73963297/20168262) using `purrr::map_chr`? – user20168262 Oct 06 '22 at 22:12
  • Nope. You could also use `purrr::map_chr`. `vapply` with `FUN.VALUE = character(1)` is just the base R version; it requires each call to return a single string, so the result is a character vector. – stefan Oct 06 '22 at 22:18
1

You can tremendously simplify your function if you use fixest's built-in formula manipulation tools (see here). In particular the dot-square-bracket operator:

library(fixest)
data(base_did)
n = 1080
base = within(base_did, {
    D = 5 * rnorm(n)
    x2 = 10 *rnorm(n)
    rand_wei = abs(rnorm(n))
    })

f = function(data, ctrl = "1"){
    feols(D ~ .[ctrl] | id + period,
          data = data, weights = ~abs(rand_wei))
}

est1 = f(base)
est2 = f(base, ~x1)           # with a formula
est3 = f(base, c("x1", "x2")) # with a character vector

etable(est1, est2, est3)
#>                     est1            est2             est3
#> Dependent Var.:        D               D                D
#>                                                          
#> x1                       0.0816 (0.0619)  0.0791 (0.0618)
#> x2                                       -0.0157 (0.0186)
#> Fixed-Effects:  -------- --------------- ----------------
#> id                   Yes             Yes              Yes
#> period               Yes             Yes              Yes
#> _______________ ________ _______________ ________________
#> S.E. type             --          by: id           by: id
#> Observations       1,080           1,080            1,080
#> R2               0.12810         0.13005          0.13094
#> Within R2             --         0.00224          0.00326
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

EDIT: Note that passing a formula (est2) only works with fixest version >= 0.11.0.
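
To check what the dot-square-bracket operator expands to without running a regression, fixest's xpd() can be applied to the same formula. This is a small sketch, assuming a recent fixest version and that xpd() picks up the character vector ctrl from the calling environment in the same way feols does inside f; the variable names are the ones used for est3.

```r
library(fixest)

# Control variables as a character vector, as in est3 above
ctrl <- c("x1", "x2")

# xpd() expands the formula without estimating anything;
# .[ctrl] is replaced using the contents of `ctrl`
fml <- xpd(D ~ .[ctrl] | id + period)
print(fml)
```

Printing the expanded formula is a quick way to confirm that no stray terms end up on the right-hand side before passing it to feols.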

Laurent Bergé