I want to run a regression in statsmodels that uses categorical variables and clustered standard errors.
I have a dataset with columns institution, treatment, year, and enrollment. Treatment is a dummy, institution is a string, and the others are numbers. I've made sure to drop any null values.
import statsmodels.formula.api as smf

df.dropna()
reg_model = smf.ols(
    "enroll ~ treatment + C(year) + C(institution)", df
).fit(cov_type='cluster', cov_kwds={'groups': df['institution']})
Fitting the model raises the following error:
ValueError: The weights and list don't have the same length.
Is there a way to fix this so my standard errors cluster?
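One likely cause, offered as a guess rather than a confirmed diagnosis: df.dropna() returns a new DataFrame and leaves df unchanged, so the formula interface drops the rows with missing values on its own while df['institution'] still has the full length. The sketch below uses synthetic stand-in data (the values, the clean name, and the group_codes name are assumptions for illustration, not taken from the real dataset) and passes groups from the same cleaned frame that the model is fitted on. Integer group codes are used as a conservative choice; passing the string column directly may work as well.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the real data (hypothetical values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "institution": rng.choice(["A", "B", "C", "D", "E"], size=n),
    "treatment": rng.integers(0, 2, size=n),
    "year": rng.choice([2018, 2019, 2020], size=n),
})
df["enroll"] = 100 + 5 * df["treatment"] + rng.normal(0, 10, size=n)
df.loc[df.index[:10], "enroll"] = np.nan  # inject some missing values

# dropna() is not in-place: keep the returned frame.
clean = df.dropna(subset=["enroll", "treatment", "year", "institution"])

# Build the cluster labels from the SAME frame used for fitting,
# so their length matches the number of rows in the model.
group_codes = pd.factorize(clean["institution"])[0]

reg_model = smf.ols(
    "enroll ~ treatment + C(year) + C(institution)", data=clean
).fit(cov_type="cluster", cov_kwds={"groups": group_codes})

print(reg_model.summary())
```

If this is indeed the problem, reassigning with df = df.dropna() before fitting should have the same effect, as long as the groups argument is taken from the reassigned frame.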