
I've noticed something strange while doing regression analysis. When I estimate a regression directly, and then estimate the same regression inside purrr::map and extract the element, the two resulting objects are not identical. Why is this the case, and should it be?

I ask mainly because some packages have trouble pulling information from estimations extracted from purrr::map, but not from ones I estimate individually. Here is a small example with some nonsensical regressions:

library(fixest)
library(tidyverse)

## creating a formula for a regression example
formula <- as.formula(paste0(
  "mpg", "~",
  paste("cyl", collapse = "+"),
  paste("|"), paste(c("gear", "carb"), collapse = "+")))

## estimating the regression directly and saving the result
mtcars_formula <- feols(formula, cluster = "gear", data = mtcars)

## estimating the same regression twice, but using map
mtcars_list_map <- map(list("gear", "gear"), ~ feols(formula, cluster = ., data = mtcars))

## extracting the first element of the list
is_identical_1 <- mtcars_list_map %>% 
  pluck(1)


## THESE ARE NOT IDENTICAL
identical(mtcars_formula, is_identical_1)

I am tagging this with the fixest package as well, since this may be package-specific.

mikeytop
  • can you replace identical with `all.equal()` and share those results – Mike Apr 25 '22 at 20:30
  • After replacing `identical()` with `all.equal()`, I got the following output: "Component “call”: target, current do not match when deparsed" – mikeytop Apr 25 '22 at 20:33
  • does this help answer the question, or are the models themselves giving different results? https://stackoverflow.com/questions/38407793/the-difference-between-two-lm-functions-and-their-outputs – Mike Apr 25 '22 at 20:36
  • Not sure this helps answer the question, but I may be misunderstanding the solution. The models themselves are giving the same output when printed, but there is something strange going on with their underlying structure. Note that when using `fwildclusterboot::boottest(mtcars_list_map[[1]], param = "cyl", clustid = "gear", B = 9999)` there is an error while `fwildclusterboot::boottest(mtcars_formula, param = "cyl", clustid = "gear", B = 9999)` works just fine. – mikeytop Apr 25 '22 at 20:46

1 Answer

The differences largely come down to environments. For example, the third element of each object (i.e. of `mtcars_formula` and `is_identical_1`) is the formula mpg~cyl, and `mtcars_formula[[3]] == is_identical_1[[3]]` returns TRUE. However, the two formulas are associated with different environments:

> mtcars_formula[[3]] == is_identical_1[[3]]
[1] TRUE
> environment(mtcars_formula[[3]])
<environment: 0x560a2490ef40>
> environment(is_identical_1[[3]])
<environment: 0x560a2554d810>
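
The same effect can be reproduced in base R without fixest. The sketch below is only an illustration (the helper `make_formula` is hypothetical and not part of the question's code): a formula created inside a function carries that call's environment, so it matches a top-level formula by text while not being identical to it.

```r
# Hypothetical helper: the formula it returns carries the
# environment of this particular function call.
make_formula <- function() mpg ~ cyl

f_top  <- mpg ~ cyl        # created in the calling environment
f_func <- make_formula()   # created inside a function call

same_text   <- f_top == f_func           # compares the deparsed formulas
same_object <- identical(f_top, f_func)  # also compares attributes, including the environment
same_env    <- identical(environment(f_top), environment(f_func))

print(c(same_text = same_text, same_object = same_object, same_env = same_env))
```

Here only `same_text` should come out TRUE, which mirrors the `==` versus `identical()` discrepancy seen with the two model objects.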

Whether you consider these differences trivial depends on your use case, but you can collect every element that differs like this:

# collect every element that is not identical between the two objects
differences <- list()
for (i in seq_along(mtcars_formula)) {
  if (!identical(mtcars_formula[[i]], is_identical_1[[i]])) {
    differences[[names(mtcars_formula)[i]]] <- list(mtcars_formula[[i]], is_identical_1[[i]])
  }
}

One element that genuinely differs is the recorded call (the 4th element):

> mtcars_formula[[4]] == is_identical_1[[4]]
[1] FALSE
> c(mtcars_formula[[4]], is_identical_1[[4]])
[[1]]
feols(fml = formula, data = mtcars, cluster = "gear")

[[2]]
feols(fml = formula, data = mtcars, cluster = .)

This may be related to the error you report in the comments above with fwildclusterboot::boottest(). Note that the call stored in the object created with map() records `cluster = .` instead of `cluster = "gear"`. If a package inspects or re-evaluates the stored call, the placeholder `.` has no meaning outside the map() lambda.
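
To see why the stored call can differ even though the evaluated arguments agree, here is a minimal base R sketch. `fit_stub` is a hypothetical stand-in for a model function; it simply assumes the call is recorded with match.call(), as many modelling functions do, and it is not fixest's actual code.

```r
# Hypothetical stand-in for a model function that stores its own call.
fit_stub <- function(cluster) {
  list(call = match.call(), cluster = cluster)
}

direct <- fit_stub(cluster = "gear")

# Same idea as map(list("gear"), ~ fit_stub(cluster = .)):
# the argument is passed through a placeholder variable named `.`
mapped <- lapply(list("gear"), function(.) fit_stub(cluster = .))[[1]]

values_match <- identical(direct$cluster, mapped$cluster)  # evaluated values
calls_match  <- identical(direct$call, mapped$call)        # unevaluated calls

print(direct$call)
print(mapped$call)
```

match.call() captures the argument expression as written at the call site, so the mapped version keeps the symbol `.` rather than the string it evaluated to.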

One way to get around this would be to do something like this:

mtcars_list_map <- map(list("gear", "gear"), function(x) {
  # create the model
  model <- feols(formula, cluster = x, data = mtcars)
  # overwrite the recorded cluster argument with its actual value
  model$call$cluster <- x
  # return the model
  model
})
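
An alternative to patching the call afterwards is to substitute the value into the call before it is evaluated, so that the recorded call already contains "gear". The sketch below shows the idea in base R with bquote(). `fit_stub` is a hypothetical stand-in that records its call with match.call(); whether feols behaves the same way with this pattern is an assumption you would need to check.

```r
# Hypothetical stand-in for a model function that stores its own call.
fit_stub <- function(cluster) {
  list(call = match.call(), cluster = cluster)
}

direct <- fit_stub(cluster = "gear")

# bquote() replaces .(x) with the value of x, so the call that is
# evaluated (and recorded) contains "gear" rather than the symbol x.
mapped <- lapply(list("gear", "gear"), function(x) {
  eval(bquote(fit_stub(cluster = .(x))))
})

print(mapped[[1]]$call)
calls_match <- identical(deparse(direct$call), deparse(mapped[[1]]$call))
```

The trade-off is readability: the model call is now wrapped in eval(bquote(...)), whereas editing `model$call` after the fact keeps the fitting line unchanged.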


langtang
  • Ah I see. Is there a way to make these `call`s equivalent? For instance, is it possible to use map while retaining equivalence across `call`? – mikeytop Apr 25 '22 at 20:55
  • see my edit, where I've added an option to manipulate the call, before sending the result of `map` – langtang Apr 25 '22 at 21:21