While doing some OLS regressions, I discovered that statsmodels.api.add_constant()
does the following:
if _is_using_pandas(data, None) or _is_recarray(data):
    from statsmodels.tsa.tsatools import add_trend
    return add_trend(data, trend='c', prepend=prepend, has_constant=has_constant)
If not, it treats data as an ndarray, so you lose some contextual information (e.g. the column names, which are the regressor variable names). When pandas is imported from modin, the _is_using_pandas() check above returns False.
It is possible that statsmodels needs to add modin as a supported option in its _is_using_pandas(), but for now, I'd like to do something like:
if is_using_modin_pandas(x):
    from statsmodels.tsa.tsatools import add_trend
    X = add_trend(x, trend='c', prepend=True, has_constant='skip')
else:
    X = sm.add_constant(x)
How would one write is_using_modin_pandas()?
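To make the question concrete, here is a rough sketch of the kind of check I have in mind. It assumes that modin's DataFrame and Series classes are defined in modules under the top-level modin package, which I have not verified for every modin version. It inspects the module name of the object's class (and its base classes) instead of importing modin, so it also runs when modin is not installed.

```python
def is_using_modin_pandas(obj):
    """Return True if obj's class, or any of its base classes, lives in the modin package.

    Assumption: modin objects report a __module__ such as "modin.pandas.dataframe".
    """
    return any(
        # Compare only the top-level package name, so "modinlike.x" does not match.
        (getattr(cls, "__module__", "") or "").split(".")[0] == "modin"
        for cls in type(obj).__mro__
    )
```

This only detects the object type. Whether add_trend() then behaves correctly on a modin object is a separate question I have not settled.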