
I'm not sure why it's doing such a crappy job. Here's the set of 189 data points I was hoping to get smoothed. Why is it lagging so much?

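import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import splprep, splev

y = data  # the 189 data points
x = range(len(y))

tck, _ = splprep([x,y])
x2, y2 = splev(np.linspace(0,1,len(y)), tck)

plt.plot(y, 'b')
plt.plot(y2, 'g')
plt.show()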


Raksha
  • gah ... nevermind ... dumb question ... for some reason, I thought since the x values are evenly distributed, it wouldn't matter if I didn't include them in the `plt.plot`, but it totally matters X.X – Raksha Aug 14 '18 at 21:19
  • Raksha, try using a Savitzky-Golay filter. Here is a sample: https://stackoverflow.com/questions/20618804/how-to-smooth-a-curve-in-the-right-way – Sheldore Aug 14 '18 at 21:46
  • @Bazingaa ooo, how nice, thank you :) – Raksha Aug 14 '18 at 22:35
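
A minimal sketch of the two points raised in the comments, using synthetic data as a stand-in for the 189 points (the noisy sine, the seed, and the `window_length` and `polyorder` values are assumptions, not from the original post). By default `splprep` parameterizes the curve by cumulative distance between points, so evenly spaced parameter values do not map to evenly spaced x values, and the smoothed y values have to be paired with the x values that `splev` returns. `savgol_filter`, by contrast, returns one output per input sample, so it stays aligned with the original index.

```python
import numpy as np
from scipy.interpolate import splprep, splev
from scipy.signal import savgol_filter

# Synthetic stand-in for the 189 data points (assumption, not the original data)
rng = np.random.default_rng(0)
x = np.arange(189)
y = np.sin(x / 15.0) + rng.normal(scale=0.2, size=x.size)

# Parametric smoothing spline, as in the question
tck, u = splprep([x, y])
x2, y2 = splev(np.linspace(0, 1, len(y)), tck)
# x2 is generally not 0, 1, 2, ..., so y2 belongs with x2, not with its own index

# Savitzky-Golay alternative: the output is aligned with the input samples
y_sg = savgol_filter(y, window_length=21, polyorder=3)
```

To compare the curves, something like `plt.plot(x, y, 'b')`, `plt.plot(x2, y2, 'g')` and `plt.plot(x, y_sg, 'r')` should put all three on the same x axis.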

1 Answer


Smoothing is a fairly common problem in time series analysis. Have you tried exponential smoothing? The statsmodels package provides several smoothing models, including SimpleExpSmoothing, Holt, and ExponentialSmoothing.

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.api import SimpleExpSmoothing

# train and test are assumed to be pandas DataFrames with a 'Count' column
y_hat_avg = test.copy()
fit2 = SimpleExpSmoothing(np.asarray(train['Count'])).fit(smoothing_level=0.6, optimized=False)
# forecast() predicts past the end of the training data, one value per test row
y_hat_avg['SES'] = fit2.forecast(len(test))
plt.figure(figsize=(16, 8))
plt.plot(train['Count'], label='Train')
plt.plot(test['Count'], label='Test')
plt.plot(y_hat_avg['SES'], label='SES')
plt.legend(loc='best')
plt.show()
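
The snippet above forecasts beyond the training data rather than smoothing the existing points. If the goal is to smooth the points you already have, the simple exponential smoothing recursion can be applied to the series directly. Below is a rough NumPy sketch of that recursion. It is not the statsmodels implementation, and the function name `simple_exp_smooth`, the sample values, and the choice to start from the first observation are assumptions made for illustration.

```python
import numpy as np

def simple_exp_smooth(values, alpha=0.6):
    """Return s with s[0] = values[0] and s[t] = alpha*values[t] + (1 - alpha)*s[t-1].

    Assumes a non-empty 1-D input and 0 < alpha <= 1.
    """
    values = np.asarray(values, dtype=float)
    smoothed = np.empty_like(values)
    smoothed[0] = values[0]
    for t in range(1, len(values)):
        smoothed[t] = alpha * values[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed

# Hypothetical sample values, only to show the call
noisy = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
smoothed = simple_exp_smooth(noisy, alpha=0.5)
```

Because each output depends only on current and past values, the result trails the data somewhat. A smaller `alpha` smooths more and trails more.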