tl;dr
for app in endog:
    # smallest strictly positive value in this column; replaces zeros so np.log is defined
    min_nonzero = series.loc[series[app] > 0, app].min()
    series.loc[series[app] == 0, app] = min_nonzero
    series[app + '_log_diff'] = np.log(series[app]).diff()
series = series.replace([np.inf, -np.inf], np.nan).dropna()
How do I invert that for plotting?
full text
I'm having trouble inverting the log-difference transform I apply to remove non-stationarity. Here's the transform:
series = u[columns].copy()
endogdiffs = []
for app in endog:
    # smallest strictly positive value in this column; replaces zeros so np.log is defined
    min_nonzero = series.loc[series[app] > 0, app].min()
    series.loc[series[app] == 0, app] = min_nonzero
    series[app + '_log'] = np.log(series[app])
    series[app + '_log_diff'] = series[app + '_log'].diff()
    endogdiffs.append(app + '_log_diff')
series = series.replace([np.inf, -np.inf], np.nan).dropna()
I then model the app_log_diff columns. The series is split into train and test sets, and the predictions are loaded back into a DataFrame called y.
As I understand it, .diff() is inverted by .cumsum(), which gives me the logs, and np.log() is inverted by np.exp().
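A minimal sketch of that round trip on a toy series, to check the reasoning in isolation. The values and the helper name `invert_log_diff` are made up for illustration and are not part of the real pipeline. It suggests that the cumulative sum alone only recovers the logs relative to the first observation, so the log level just before the first difference has to be added back before exponentiating.

```python
import numpy as np
import pandas as pd


def invert_log_diff(log_diffs: pd.Series, anchor_log: float) -> pd.Series:
    """Undo np.log(x).diff(): cumulative-sum the differences, add the log
    level that precedes the first difference, then exponentiate."""
    return np.exp(anchor_log + log_diffs.cumsum())


# Toy data standing in for one app column (hypothetical values in the 0-1 range).
original = pd.Series([0.2, 0.5, 0.4, 0.8, 0.1])
logs = np.log(original)
log_diffs = logs.diff().dropna()  # the first row is NaN after diff(), so it is dropped

# Without the anchor, exp(cumsum) is the ratio x_t / x_0, not x_t itself.
ratios = np.exp(log_diffs.cumsum())

# With the anchor (the log of the first observation), the original levels come back.
recovered = invert_log_diff(log_diffs, logs.iloc[0])
```

If this carries over to the forecast, the anchor would presumably be the last observed log level before the forecast window (for example the final training value of `app + '_log'`), with the predicted differences passed as `log_diffs`. That is an assumption about the setup above, not something verified against the real data.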
For the output, I would expect to plot it like this:
# plot the output
for i, app in enumerate(endog):
    # pd.concat replaces Series.append, which was removed in pandas 2.0
    plt.plot(np.exp(pd.concat([train[app + '_log_diff'], y[app + '_log_diff']]).cumsum()), color=[(i/10)+0.5, (i/10)+0.5, (i/10)+0.5])
    plt.plot(np.exp(pd.concat([train[app + '_log_diff'], test[app + '_log_diff']]).cumsum()), color=appColors[i])
But my initial values (all of them, not just the endogenous ones) are between 0 and 1, while the output values range from about 1 to 50-something, or 60-some for the y predictions.
How do I invert the transform?
Detail on the prediction section:
# train and run the model
train, test = series[:size], series[size:size+(28*4*24)]
train = train.loc[:, (train != train.iloc[0]).any()] # https://stackoverflow.com/questions/20209600/panda-dataframe-remove-constant-column
test = test.loc[:, (test != test.iloc[0]).any()]
#print(train.var(), X.info())
# train autoregression
model = VARMAX(train[endogdiffs], exog=train[exog])
model_fit = model.fit(method='cg')  # the optimizer is selected with the 'method' keyword
#print(model_fit.mle_retvals)
model_fit.plot_diagnostics()
##window = model_fit.k_ar
coef = model_fit.params
predictions = pd.DataFrame()
predictions = model_fit.forecast(steps=len(test), exog=test[exog])
y = predictions.copy()