I have a feeling there is a very simple way of doing this. I'm trying to plot a timeline of tasks running in an environment, with two plots on the same diagram:
- the task run-times as a broken_barh
- an overall load curve based on the aggregate of tasks at each time point (or a histogram), say with lower opacity or as a line.
In the example there are 6 tasks (A-F) running for various lengths, with different start times. They are plotted exactly as I need (1/), in a Gantt-like chart with time on the X axis.
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib as mpl
from matplotlib import pyplot as plt
cols=['ID','From','To']
df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
['C', 736758.994, 736758.996], ['D', 736758.996, 736758.997],
['E', 736758.996, 736758.997], ['F', 736758.995, 736758.996]],
columns=cols)
df['Diff'] = df['To']-df['From']
fig,ax=plt.subplots()
# one bar per task, on its own row; "row" avoids shadowing the built-in slice
for i, row in df.iterrows():
    values = [[row['From'], row['Diff']]]
    ax.broken_barh(values, (i-0.4, 0.8), color=np.random.rand(3))
ax.xaxis_date()
To this I would like to add 2/ a curve showing the active task count at each time (1 between 23:51-23:52, 2 for 23:52-23:53 etc., peaking around 23:54).
The problem is that I cannot just draw a histogram of the start times, since the tasks overlap in time. Do you know a decent way to create such a histogram?
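One direction that might work, given here as a minimal sketch and not a definitive answer: treat every start time as a +1 event and every end time as a -1 event, sort the events by time, and take a running sum. The result is the number of active tasks from each event time onward. The helper names `active_counts` and `overlay_load` are hypothetical, introduced only for illustration, and `ax` is assumed to be the axes that already holds the broken_barh plot.

```python
import pandas as pd

def active_counts(df):
    """Number of running tasks from each event time onward (hypothetical helper)."""
    starts = pd.Series(1, index=df['From'])   # +1 when a task starts
    ends = pd.Series(-1, index=df['To'])      # -1 when a task ends
    events = pd.concat([starts, ends])
    # net change per timestamp (groupby sorts by time), then running total
    return events.groupby(level=0).sum().cumsum()

def overlay_load(ax, load):
    """Draw the load as a semi-transparent step curve on a secondary y axis."""
    ax2 = ax.twinx()                          # shares the time axis with ax
    ax2.step(load.index, load.values, where='post', color='k', alpha=0.4)
    ax2.set_ylabel('active tasks')
    return ax2

# same data as in the question
df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
                   ['C', 736758.994, 736758.996], ['D', 736758.996, 736758.997],
                   ['E', 736758.996, 736758.997], ['F', 736758.995, 736758.996]],
                  columns=['ID', 'From', 'To'])
load = active_counts(df)
# overlay_load(ax, load)   # assumes ax is the axes from the broken_barh plot
```

A step curve with where='post' is used because the count stays constant between events. This sketch treats intervals as half-open, so a task ending at the same instant another one starts is not counted as overlapping it.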