Is there a way to make predictions from a fixest model on an observation that has an out-of-sample fixed effects level? I would like this prediction to be based on the weighted mean of the existing fixed effects levels in the training data. For the weights I would like to use the number of observations for each FE level.

Currently, I re-estimate a model without fixed effects and use it for prediction whenever the full model yields a missing value. However, I am looking for a solution that does not require re-estimating or updating the model, similar to using the na.fill argument when predicting from a plm model (see this Stack Overflow answer).

In the example below, the Product variable takes on integers from 1 to 20 in the training data, so the prediction of 21 yields a missing value:

library(tidyverse)
library(fixest)

# fit model
data(trade)
mod <- feols(log(Euros) ~ log(dist_km) | Product, trade)

# define new data
df <- tribble(
  ~dist_km, ~Product,
  140, 20, # in sample
  140, 21  # out of sample
)

# no prediction for FE level 21 that is not in the training data
predict(mod, newdata = df)
#> [1] 20.14376       NA

Created on 2023-03-08 with reprex v2.0.2

Using a model without fixed effects, the value that is currently missing would be replaced by 18.88489.
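For reference, the fallback described above might look roughly like the following sketch. The names mod_nofe, pred_fe and pred_nofe are introduced here for illustration only, and a plain data.frame stands in for the tribble.

```r
library(fixest)

data(trade)

# full model with Product fixed effects, plus a fallback model without them
mod      <- feols(log(Euros) ~ log(dist_km) | Product, trade)
mod_nofe <- feols(log(Euros) ~ log(dist_km), trade)

df <- data.frame(dist_km = c(140, 140), Product = c(20, 21))

pred_fe   <- predict(mod, newdata = df)       # NA for the unseen level 21
pred_nofe <- predict(mod_nofe, newdata = df)  # available for every row

# keep the fixed-effects prediction where it exists, otherwise fall back
pred <- ifelse(is.na(pred_fe), pred_nofe, pred_fe)
pred
```

The drawback is the second call to feols(), which is exactly the re-estimation I would like to avoid.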

dufei

1 Answer

I do not think this is currently possible with the fixest package. You could do it manually, e.g.:

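oos <- fixef(mod) |> purrr::map_dbl(function(x) {
  # simple (unweighted) mean of the estimated fixed-effects coefficients
  mean(x)
})

# fill the missing prediction with that mean (note: this ignores log(dist_km))
predict(mod, newdata = df) |> tidyr::replace_na(oos)
#> [1] 20.14376 28.84476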
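To weight each level by its number of observations and to keep the contribution of log(dist_km), something along these lines might work. It is a sketch, not a built-in fixest feature. It assumes the fitted object stores the per-observation fixed effects in mod$sumFE, and it rebuilds the linear part by hand for this particular formula. The names fe_wmean, xb and miss are introduced here for illustration.

```r
library(fixest)

data(trade)
mod <- feols(log(Euros) ~ log(dist_km) | Product, trade)
df  <- data.frame(dist_km = c(140, 140), Product = c(20, 21))

# mod$sumFE holds the fixed effect of every observation used in the fit, so
# its plain mean weights each Product level by its number of observations
fe_wmean <- mean(mod$sumFE)

# linear part of the prediction, written out for this specific formula
xb <- coef(mod)[["log(dist_km)"]] * log(df$dist_km)

# regular prediction, then fill rows whose FE level was not in the training data
pred <- predict(mod, newdata = df)
miss <- is.na(pred)
pred[miss] <- xb[miss] + fe_wmean
pred
```

For a model with more regressors, the xb line would need to be extended accordingly; no refitting is involved either way.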
Julian
  • Thanks for the answer! For the weights I would like to use the number of observations for each FE level in the training data. I will add this to the original post. Also, I would not replace the missing value with the FE value only but also consider the remaining variables for the prediction. – dufei Mar 13 '23 at 17:11