I tried searching everywhere, but couldn't find this: how can I run a diff-in-diff with fixed effects in Python?
I already know how to run a basic diff-in-diff. For instance, consider the njmin dataset, which covers the minimum wage increase in New Jersey. First of all, sorry about the screenshots; I know they're not recommended, but I think they help here. The question is whether the minimum wage increase affected employment. You can find more about the problem here. I ran an OLS regression to see whether d_nj, which is the product of d (1 after the minimum wage increase) and nj (1 if the store is in New Jersey), has any effect on fte, the number of full-time equivalent employees. Basically, we want to know if the change in the minimum wage affected employment.
import pandas as pd
import statsmodels.formula.api as smf

# Load the New Jersey minimum wage data
df = pd.read_csv('/njmin3.csv')

# Regress fte on the interaction term d_nj plus the chain and region dummies
model = smf.ols(
    formula="fte ~ d_nj + kfc + roys + wendys + CO_OWNED + SOUTHJ + CENTRALJ + PA1",
    data=df,
).fit()
print(model.summary())
As you can see, we have a diff-in-diff model to test whether the minimum wage increase in New Jersey had an impact on employment. d_nj was not significant.
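For reference, here is a minimal sketch of how the interaction term is built and how the textbook two-group, two-period specification looks. It runs on synthetic data, not on njmin3.csv: the frame `toy`, the sample size, and the coefficients (including the true effect of 2.0) are all made up for illustration. Note that the textbook form also includes the main effects d and nj, which the regression above leaves out.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the real data (every value here is made up).
rng = np.random.default_rng(0)
n = 2000
toy = pd.DataFrame({
    "nj": rng.integers(0, 2, n),  # 1 = New Jersey, 0 = comparison state
    "d": rng.integers(0, 2, n),   # 1 = after the increase, 0 = before
})

# The diff-in-diff term is the product of the two indicators.
toy["d_nj"] = toy["d"] * toy["nj"]

# Outcome with a known (assumed) treatment effect of 2.0 plus noise.
toy["fte"] = (
    20 - 3 * toy["nj"] - 2 * toy["d"] + 2.0 * toy["d_nj"]
    + rng.normal(0, 1, n)
)

# Textbook specification: both main effects plus the interaction.
did = smf.ols("fte ~ nj + d + d_nj", data=toy).fit()
print(did.params["d_nj"])  # estimate of the diff-in-diff effect
```

The coefficient on d_nj is the diff-in-diff estimate; the coefficients on nj and d pick up the baseline gap between the groups and the common time trend.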
Now, suppose I have many cities and many data points, and want to include fixed effects. What can I do?
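To make the setup concrete, here is one possible sketch of a two-way fixed effects diff-in-diff using dummy variables in a statsmodels formula. Everything in it is an assumption for illustration: the synthetic panel, the column names `city`, `period`, `treated`, `post`, and `did`, and the true effect of 1.5. It is not the only way to do this, and it is not based on the njmin data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic balanced panel: 40 hypothetical cities observed over 8 periods.
rng = np.random.default_rng(1)
n_cities, n_periods = 40, 8
panel = pd.DataFrame(
    [(c, t) for c in range(n_cities) for t in range(n_periods)],
    columns=["city", "period"],
)

# Half of the cities are treated, starting from period 4 (assumed design).
panel["treated"] = (panel["city"] < n_cities // 2).astype(int)
panel["post"] = (panel["period"] >= n_periods // 2).astype(int)
panel["did"] = panel["treated"] * panel["post"]

# Outcome = city effect + period effect + assumed treatment effect + noise.
city_effect = rng.normal(0, 2, n_cities)
period_effect = rng.normal(0, 1, n_periods)
panel["y"] = (
    city_effect[panel["city"].to_numpy()]
    + period_effect[panel["period"].to_numpy()]
    + 1.5 * panel["did"]
    + rng.normal(0, 0.5, len(panel))
)

# C(...) expands a column into dummies, giving city and period fixed effects.
# treated and post are omitted because the fixed effects absorb them.
# Standard errors are clustered by city.
fe = smf.ols("y ~ did + C(city) + C(period)", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["city"]}
)
print(fe.params["did"], fe.bse["did"])
```

With a very large number of cities the dummy approach can get slow, since every city adds a column. In that case a within estimator that absorbs the effects, such as PanelOLS in the linearmodels package, may be a better fit.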