I would like to visualize the CSV data shown below as a time series, using Python's pandas module (see the links below).
Sample data of df1:
             TIMESTAMP  eventid
0  2017-03-20 02:38:24        1
1  2017-03-21 05:59:41        1
2  2017-03-23 12:59:58        1
3  2017-03-24 01:00:07        1
4  2017-03-27 03:00:13        1
The 'eventid' column always contains the value 1, and I am trying to show the sum of events for each day in the dataset. Is pandas.Series.cumsum() the correct function to use for this purpose?
Script so far:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df1 = pd.read_csv('timestamp01.csv')
print(df1.columns)  # u'TIMESTAMP', u'eventid'
# I: ts = pd.Series(df1['eventid'], index=df1['TIMESTAMP'])
# O: Blank plot
# I: ts = pd.Series(df1['eventid'], index=pd.date_range(df1['TIMESTAMP'], periods=1000))
# O: TypeError: Cannot convert input ... Name: TIMESTAMP, dtype: object] of type <class 'pandas.core.series.Series'> to Timestamp
# working test example:
# I: ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
# O: See first link below (first plot).
ts = ts.cumsum()
ts.plot()
plt.show()
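For comparison, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the file is comma-separated and that the TIMESTAMP strings parse as datetimes; the inline csv_text is only a stand-in for timestamp01.csv. Per-day totals come from resample('D').sum(), whereas cumsum() produces a running total across days rather than a count per day.

```python
import io
import pandas as pd

# Stand-in for timestamp01.csv (assumed comma-separated); with the real file:
# df1 = pd.read_csv('timestamp01.csv', parse_dates=['TIMESTAMP'])
csv_text = """TIMESTAMP,eventid
2017-03-20 02:38:24,1
2017-03-21 05:59:41,1
2017-03-23 12:59:58,1
2017-03-24 01:00:07,1
2017-03-27 03:00:13,1
"""
df1 = pd.read_csv(io.StringIO(csv_text), parse_dates=['TIMESTAMP'])

# Move the timestamps into the index. Note that
# pd.Series(df1['eventid'], index=df1['TIMESTAMP']) would instead reindex the
# existing Series by labels it does not contain, leaving only NaN values.
ts = df1.set_index('TIMESTAMP')['eventid']

# Sum of events per calendar day; days without events get 0.
daily = ts.resample('D').sum()

# Running total across days (what cumsum() computes).
running = daily.cumsum()

print(daily)
print(running)
```

Calling daily.plot() or running.plot() followed by plt.show() should then draw the respective series against the date index.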
Links I have tried to follow:
http://pandas.pydata.org/pandas-docs/stable/visualization.html
Aggregating timeseries from sensors
(The example above uses varying values, unlike my 'eventid' data, which is always 1.)
Any help is much appreciated.