I'm currently setting up a Jupyter notebook with some plotting functionality, and I want to try plotly for that. The data comes from a PostgreSQL server, and I fetch it directly into a pandas DataFrame using df = pd.read_sql(query.statement, query.session.bind).
Now I want to plot multiple time series in one single plot. The x-axis is a series of datetime objects (df['timestamp']) and the y values are floats (df['data']). Plotting two datasets from different data frames works like this:
import pandas as pd
import plotly.graph_objs as go
import plotly.offline as py
trace1 = go.Scatter(x=df1['timestamp'], y=df1['data'])
trace2 = go.Scatter(x=df2['timestamp'], y=df2['data'])
data = [trace1, trace2]
fig = dict(data=data)
py.iplot(fig)
The data represents two measurement time series. The two datasets were recorded on different days, but I'm interested in comparing them over elapsed time. So my idea was to remove the time offset (the first timestamp df['timestamp'][0] or the earliest timestamp min(df['timestamp'])) by subtraction:
df['timestamp'] = df['timestamp'] - df['timestamp'][0]
However, this results in a series of pandas._libs.tslib.Timedelta objects. Plotting multiple timedelta series in one plot works with matplotlib, but not with plotly. For some reason, the two graphs are oddly concatenated instead of overlaid (see picture, if available), although they should appear on the same timescale. Any ideas?
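To make this reproducible without the database, here is a minimal sketch of the offset removal on synthetic data. The frames df1 and df2 below are made-up stand-ins for my query results, and to_elapsed_seconds is a hypothetical helper name, not part of pandas or plotly. It also shows one workaround I'm considering, assuming plain floats are acceptable on the x-axis: converting the timedeltas to elapsed seconds.

```python
import pandas as pd


def to_elapsed_seconds(df, column="timestamp"):
    """Return a copy of df where `column` holds seconds since its earliest value."""
    out = df.copy()
    elapsed = out[column] - out[column].min()   # Series of timedeltas
    out[column] = elapsed.dt.total_seconds()    # plain floats instead of Timedelta
    return out


# Synthetic stand-ins for the two query results (all values are made up).
df1 = pd.DataFrame({
    "timestamp": pd.date_range("2018-01-01 08:00", periods=4, freq="30min"),
    "data": [1.0, 2.0, 1.5, 3.0],
})
df2 = pd.DataFrame({
    "timestamp": pd.date_range("2018-01-05 14:00", periods=4, freq="30min"),
    "data": [0.5, 2.5, 2.0, 1.0],
})

# Both series now start at 0.0 and share a numeric time axis.
rel1 = to_elapsed_seconds(df1)
rel2 = to_elapsed_seconds(df2)
```

With this, rel1['timestamp'] and rel2['timestamp'] could be passed as x to go.Scatter in place of the timedelta series, at the cost of losing the formatted time axis. I would still prefer a way to plot the timedeltas directly.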