Suppose I have the following toy data:
import pandas as pd
from linearmodels.panel import PanelOLS
y = pd.DataFrame(
    index=[[1, 1, 1, 2, 2, 2], [1, 2, 3, 1, 2, 3]],
    data=[70, 60, 50, 30, 33, 27],
    columns=["y"],
)
y.index.set_names(["Entity", "Time"], inplace=True)
x = pd.DataFrame(
    index=[[1, 1, 1, 2, 2, 2], [1, 2, 3, 1, 2, 3]],
    data=[[100], [89], [62], [29], [49], [23]],
    columns=["X"],
)
x.index.set_names(["Entity", "Time"], inplace=True)
I then fit a model using PanelOLS with entity_effects=True:
model_within = PanelOLS(dependent=y, exog=x, entity_effects=True).fit()
I then want to use the predict() method to see how a new "entity" would be modelled. First, I create a new entity:
new_x = pd.DataFrame(
    index=[[3, 3, 3], [1, 2, 3]],
    data=[[40], [70], [33]],
    columns=["X"],
)
new_x.index.set_names(["Entity", "Time"], inplace=True)
Then predicting with:
model_within.predict(new_x)
To get the following output:
Entity | Time | predictions |
---|---|---|
3 | 1 | 16.136230 |
3 | 2 | 28.238403 |
3 | 3 | 13.312390 |
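As a sanity check on what these numbers are, here is a minimal pure-Python sketch (no linearmodels involved; the helper name `within_slope` and the dictionary layout are hypothetical, chosen only for illustration). It recomputes the within slope from the toy data by demeaning y and X within each entity, then multiplies that slope by the new X values. I am assuming a single regressor and no constant, as in the model above.

```python
# Toy data from above, laid out as {entity: [(y, X), ...]}.
data = {
    1: [(70, 100), (60, 89), (50, 62)],
    2: [(30, 29), (33, 49), (27, 23)],
}


def within_slope(panel):
    """Slope from regressing entity-demeaned y on entity-demeaned X.

    Hypothetical helper for a single regressor; not part of linearmodels.
    """
    num = 0.0
    den = 0.0
    for obs in panel.values():
        y_bar = sum(y for y, _ in obs) / len(obs)
        x_bar = sum(x for _, x in obs) / len(obs)
        for y, x in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


beta = within_slope(data)

# Raw (not demeaned) X values for the new entity 3.
new_x = [40, 70, 33]
predictions = [beta * x for x in new_x]

print(round(beta, 6))
print([round(p, 6) for p in predictions])
```

The products agree with the predict() output above to the printed precision, which suggests that, at least with the default arguments used here, the predictions are simply the estimated slope times the raw X values, with no entity effect added and no demeaning of the new X.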
According to Wooldridge, 2012, pg 485, the within estimator is estimating:
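In the usual notation for the within (time-demeaned) transformation with a single regressor, and assuming this is the equation meant, that is:

$$
y_{it} - \bar{y}_i = \beta_1 \left( x_{it} - \bar{x}_i \right) + u_{it} - \bar{u}_i, \qquad t = 1, 2, \dots, T,
$$

where $\bar{y}_i$, $\bar{x}_i$ and $\bar{u}_i$ are the averages over time for entity $i$. The symbols are my own choice of notation and may differ from the book's.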
Since this models the deviation of y from the entity's own average y over time, how should the predictions be interpreted? My intuition is that the prediction says: for the new entity 3 in time period 1, given these X inputs, y at time 1 should be about 16 units higher than its average y across all time periods for this entity. Is this interpretation correct? How might it be improved?