I am doing time series analysis with statsmodels in Python and I get an error when fitting a seasonal ARIMA model (SARIMAX). I can set the order and seasonal_order parameters, but the 4th statement of the code below raises:
AttributeError: 'DataFrame' object has no attribute 'date'.
My dataframe has 'date' and 'value' attributes, but when I print its columns, only the "value" column is shown. Here is my code:
import matplotlib.pyplot as plt
import statsmodels.api as sm

y_hat_avg = test_data.copy()
mod = sm.tsa.statespace.SARIMAX(train_data.value, order=(1, 1, 0), seasonal_order=(1, 0, 0, 12),
                                enforce_stationarity=False, enforce_invertibility=False)
fit4 = mod.fit()
y_hat_avg['SARIMA'] = fit1.predict(start=test_data.date.iloc[0], end=test_data.date.iloc[-1], dynamic=True)
plt.figure(figsize=(16, 8))
plt.plot(train_data['value'], label='Train Data')
plt.plot(test_data['value'], label='Test Data')
plt.plot(y_hat_avg['SARIMA'], label='SARIMA')
plt.legend(loc='best')
plt.show()
The error occurs on this line:
y_hat_avg['SARIMA'] = fit1.predict(start=test_data.date.iloc[0], end=test_data.date.iloc[-1], dynamic=True)
When I print the columns, only the value column is shown:
print(test_data.columns)
>Index(['value'], dtype='object')
But when I print the dataframe head, 'date' appears on its own line with an unwanted extra space next to it, like this:
print(df.head())
               value
date
1996-12-31  0.927377
1997-12-31  0.927546
1998-12-31 -0.359870
1999-12-31  0.537907
2000-12-31  1.281655
Maybe the error is because of the extra space, but I am not sure.
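To narrow this down, here is a minimal sketch that seems to reproduce the situation using pandas alone. It assumes that 'date' ended up as the index of the dataframe rather than as a regular column, which is what the head() output above suggests. The five rows are copied from that output, and the variable names (columns, index_name, as_column, and so on) are made up for illustration.

```python
import pandas as pd

# Hypothetical frame rebuilt from the head() output above; the real data is larger.
df = pd.DataFrame(
    {"value": [0.927377, 0.927546, -0.359870, 0.537907, 1.281655]},
    index=pd.to_datetime(
        ["1996-12-31", "1997-12-31", "1998-12-31", "1999-12-31", "2000-12-31"]
    ),
)
df.index.name = "date"  # assumption: 'date' is the index name, not a column

columns = list(df.columns)   # the index is not listed among the columns
index_name = df.index.name   # the name printed on its own line by head()

# Attribute access only looks up columns, so df.date fails here.
try:
    df.date
    attribute_error = None
except AttributeError as exc:
    attribute_error = str(exc)

# The dates are still reachable through the index ...
first_date = df.index[0]
last_date = df.index[-1]

# ... or by turning the index back into an ordinary column.
as_column = df.reset_index()

print(columns, index_name)
print(attribute_error)
print(first_date, last_date)
print(list(as_column.columns))
```

If that assumption holds for my data, then test_data.index[0] and test_data.index[-1] would presumably be the values to pass as start and end, but I have not confirmed that this is the right way to call predict.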