Suppose I have the following toy data:

import pandas as pd
from linearmodels.panel import PanelOLS

y = pd.DataFrame(
    index=[[1, 1, 1, 2, 2, 2], [1, 2, 3, 1, 2, 3]],
    data=[70, 60, 50, 30, 33, 27],
    columns=["y"],
)
y.index.set_names(["Entity", "Time"], inplace=True)

x = pd.DataFrame(
    index=[[1, 1, 1, 2, 2, 2], [1, 2, 3, 1, 2, 3]],
    data=[[100], [89], [62], [29], [49], [23]],
    columns=["X"],
)
x.index.set_names(["Entity", "Time"], inplace=True)

I build a model using PanelOLS with entity_effects=True:

model_within = PanelOLS(dependent=y, exog=x, entity_effects=True).fit()

I then want to use the predict() method to see how a new "entity" would be modelled. First, I create a new entity:

new_x = pd.DataFrame(
    index=[[3, 3, 3], [1, 2, 3]],
    data=[[40], [70], [33]],
    columns=["X"],
)
new_x.index.set_names(["Entity", "Time"], inplace=True)

Then I predict with:

model_within.predict(new_x)

This gives the following output:

             predictions
Entity Time
3      1       16.136230
       2       28.238403
       3       13.312390

According to Wooldridge (2012, p. 485), the within estimator estimates:

$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i, \quad t = 1, 2, \dots, T$$

Since this models the deviation of y from the entity's own average y over time, how should the predictions be interpreted? My intuition is that the prediction says: for the new entity 3 in time period 1, given these X inputs, y should be about 16 units higher than its average y across all time periods for this entity. Is this interpretation correct? How might it be improved?

linearmodels .predict() documentation

Chris

1 Answer

Posting the clarification I received after asking in the repo: https://github.com/bashtage/linearmodels/issues/465

"The model is always Y=XB + epsilon + (eta_t ) + (nu_i ). The effects are treated as errors, and so when you predict you get new_x @ params and so the entity effects are not used."
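The quoted explanation can be checked by hand on the toy data from the question. The following is a minimal, dependency-free sketch rather than linearmodels' own code: the `panel` dictionary and the `within_slope` helper are hypothetical names introduced here, and the slope is computed with the textbook within (entity-demeaned) formula for a single regressor.

```python
# Toy panel from the question, stored as {entity: [(x, y), ...]} in time order.
# The layout and helper name are illustrative assumptions, not linearmodels API.
panel = {
    1: [(100, 70), (89, 60), (62, 50)],
    2: [(29, 30), (49, 33), (23, 27)],
}


def within_slope(panel):
    """Slope of the within regression: demean x and y per entity, then OLS."""
    sxy = 0.0
    sxx = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


beta = within_slope(panel)

# "new_x @ params": multiply the new X values by the slope, with no entity effect added.
new_x = [40, 70, 33]
predictions = [beta * x for x in new_x]

print(round(beta, 6))
print([round(p, 6) for p in predictions])
```

The printed predictions should agree with the predict() output shown in the question, which is consistent with the entity effects being left out of the prediction.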

So predict() returns new_x @ params: the predictions are on the scale of y itself, not time-demeaned, and they do not include any estimated entity effect. To obtain time-demeaned predictions instead, one can fit the same model on data that has first been time-demeaned, and pass time-demeaned new data to predict().
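As a rough illustration of the time-demeaning route, the sketch below demeans the training data within each entity, fits a slope through the origin on the demeaned values, and applies it to the demeaned new X. It uses plain Python rather than PanelOLS, and the `demean` helper and variable names are assumptions made for this example.

```python
def demean(values):
    """Subtract the mean of the series from each of its values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Training data from the question, grouped by entity (time order preserved).
x_by_entity = {1: [100, 89, 62], 2: [29, 49, 23]}
y_by_entity = {1: [70, 60, 50], 2: [30, 33, 27]}

# Time-demean within each entity, then stack the entities.
x_dm = [v for entity in x_by_entity for v in demean(x_by_entity[entity])]
y_dm = [v for entity in y_by_entity for v in demean(y_by_entity[entity])]

# OLS through the origin on the demeaned data gives the within slope.
beta = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

# Demean the new entity's X over its own time periods, then predict deviations.
new_x_dm = demean([40, 70, 33])
deviation_predictions = [beta * v for v in new_x_dm]

print([round(p, 4) for p in deviation_predictions])
```

Each value here is a predicted deviation of y from entity 3's own time average, so the values sum to zero across the three periods. Turning them into levels would still require an estimate of that entity's average y, which the model cannot supply for an unseen entity.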

Chris