I am trying to do time series analysis on my website pageview data:
import numpy as np
import pandas as pd
import pyflux as pf
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv('pageviews.csv')
data = data[:100]
print(data.head())
# data.index = data['timestamp'].values
plt.figure(figsize=(30,15))
plt.plot(data['timestamp'], data['event_count'])
plt.ylabel('Event Counts')
plt.title('Page Views')
I am following the PyFlux ARIMA tutorial.
Surprisingly, my time series plot looks like this: https://i.stack.imgur.com/Ul9wp.jpg
Here is a sample of my data:
timestamp event_count
0 2017-10-05T18:00:00Z 1691
1 2017-10-05T19:00:00Z 1436
2 2017-08-13T06:00:00Z 735
3 2017-08-13T07:00:00Z 706
4 2017-08-13T05:00:00Z 780
I can't understand what is going wrong.
The tutorial is here: http://pyflux.readthedocs.io/en/latest/arima.html
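
One thing visible in the sample rows is that they are not in chronological order (the October rows come before the August rows), and the timestamp column is read as plain strings. Below is a minimal sketch of a check, assuming that this ordering is what tangles the plot; that is a guess, not a confirmed cause. The inline CSV is a stand-in built from the five sample rows (the comma-separated layout is assumed), and `plot_pageviews` is a hypothetical helper name.

```python
import io

import pandas as pd

# Stand-in for pd.read_csv('pageviews.csv'): the five sample rows from the
# question, written as comma-separated text (the real file layout is assumed).
sample_csv = io.StringIO(
    "timestamp,event_count\n"
    "2017-10-05T18:00:00Z,1691\n"
    "2017-10-05T19:00:00Z,1436\n"
    "2017-08-13T06:00:00Z,735\n"
    "2017-08-13T07:00:00Z,706\n"
    "2017-08-13T05:00:00Z,780\n"
)
data = pd.read_csv(sample_csv)

# Convert the ISO 8601 strings into real datetimes so they are ordered
# by time rather than treated as text labels.
data['timestamp'] = pd.to_datetime(data['timestamp'])

# Sort chronologically so the line is drawn from oldest to newest point.
data = data.sort_values('timestamp').reset_index(drop=True)


def plot_pageviews(df):
    """Hypothetical helper: same plot as in the question, on sorted data."""
    import matplotlib.pyplot as plt  # imported here so the check above runs without it

    plt.figure(figsize=(30, 15))
    plt.plot(df['timestamp'], df['event_count'])
    plt.ylabel('Event Counts')
    plt.title('Page Views')
    plt.show()
```

Calling `plot_pageviews(data)` in the notebook should then draw the points in time order. If the plot still looks wrong after this, the ordering was probably not the problem.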