I am getting very different results from Python's statsmodels OLS (statsmodels.formula.api.ols()) and R's lm() when run on the same data. The R results are about what I expected; the Python results are not. I'm sure there's something really basic I've misunderstood. Any help is much appreciated.
Python
import statsmodels.formula.api as smf
import pandas as pd
df = pd.DataFrame({'date': [1.5488064e+18, 1.5043968e+18],
'count': [15.0, 12.0]})
fit = smf.ols('count~date', data=df).fit()
new_data = pd.DataFrame({'date': [1.398816e+18, 1.337040e+18]})
new_data['count'] = fit.predict(new_data)
print(new_data)
results in:
           date      count
0  1.398816e+18  12.387341
1  1.337040e+18  11.840278
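One quick check on the Python numbers, offered as a rough diagnostic rather than an explanation: if the ratio count/date is nearly the same for both predicted rows, the fitted line would be passing almost through the origin, which might mean the intercept ended up close to zero. The values below are copied from the output above; the variable names are only illustrative.

```python
# Predictions copied from the statsmodels output above.
dates = [1.398816e+18, 1.337040e+18]
preds = [12.387341, 11.840278]

# If count/date is (almost) identical for both rows, the predictions are
# proportional to date, i.e. the fitted intercept is effectively zero.
ratios = [p / d for p, d in zip(preds, dates)]
rel_diff = abs(ratios[0] - ratios[1]) / ratios[0]

print("count/date ratios:", ratios)
print("relative difference between ratios:", rel_diff)
```

A very small relative difference here would be consistent with a fit through the origin, though it does not by itself show why that happened.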
R
df <- data.frame(date=c(1.5488064e+18, 1.5043968e+18),
count=c(15.0, 12.0))
fit <- lm(count~date, data=df)
new_data <- data.frame(date=c(1.398816e+18, 1.337040e+18))
new_data[['count']] <- predict(fit, new_data)
print(new_data)
results in:
          date     count
1 1.398816e+18 4.8677043
2 1.337040e+18 0.6945525
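As a sanity check on what to expect: with only two data points, the least-squares line with an intercept passes exactly through both of them, so the predictions can be worked out by hand without any library. The sketch below does that in plain Python; the helper name `predict` is just for illustration, and working relative to the first date is a choice made here to keep the arithmetic away from very large intermediate values.

```python
# The two training points (date, count) from the data frame in the question.
x0, y0 = 1.5043968e+18, 12.0
x1, y1 = 1.5488064e+18, 15.0

# Slope of the straight line through both points.
slope = (y1 - y0) / (x1 - x0)

def predict(x):
    """Evaluate the line through (x0, y0) and (x1, y1) at x."""
    # Measuring x relative to x0 avoids forming a huge intercept term.
    return y0 + slope * (x - x0)

for x in (1.398816e+18, 1.337040e+18):
    print(x, predict(x))
```

If this hand calculation agrees with one of the two outputs, that output is the one consistent with an ordinary line fit through the two points.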
This seems similar to this and this, but nothing in those questions solves my situation.