I am having problems running OLS in Python after reading in Stata data. Below are my code and the error message:
import pandas as pd # To read data
import numpy as np
import statsmodels.api as sm
gss = pd.read_stata("gssSample.dta", preserve_dtypes=False)
X = gss[['age', 'impinc' ]]
y = gss[['educ']]
X = sm.add_constant(X) # adding a constant
model = sm.OLS(y, X).fit()
print(model.summary())
The error message says:
ValueError: Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).
Any thoughts on how to run this simple OLS?
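For reference, here is a minimal sketch of the check the error message suggests. It assumes (not confirmed for gssSample.dta) that at least one column, here `impinc`, was read in as a labelled categorical or string column rather than as numbers. The DataFrame below is made-up stand-in data, not the real file, and the value "refused" is a hypothetical label.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for gssSample.dta: 'impinc' arrives as a labelled
# categorical column instead of a numeric one (an assumption, see above).
gss = pd.DataFrame({
    "age": [25, 40, 31, 58],
    "impinc": pd.Categorical(["1000", "2500", "refused", "4000"]),
    "educ": [12, 16, 14, 10],
})

cols = ["age", "impinc", "educ"]

# Step 1: inspect the dtypes and do the check named in the error message.
print(gss[cols].dtypes)
raw_dtype = np.asarray(gss[["age", "impinc"]]).dtype
print(raw_dtype)  # an object dtype here is what the OLS call complains about

# Step 2: coerce every column to numbers; non-numeric labels become NaN.
clean = gss[cols].apply(lambda s: pd.to_numeric(s.astype(str), errors="coerce"))

# Step 3: drop rows with missing values and force plain floats.
clean = clean.dropna()
X = clean[["age", "impinc"]].astype(float)
y = clean["educ"].astype(float)

print(np.asarray(X).dtype)
```

If the dtypes come out numeric after this, `sm.OLS(y, sm.add_constant(X)).fit()` should no longer hit the object-dtype error. Another option that may be worth trying is `pd.read_stata("gssSample.dta", convert_categoricals=False)`, which keeps the underlying numeric codes instead of converting value labels to categoricals; whether those codes are meaningful as regressors depends on how the variable is coded.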