I would like to combine the best of this and this question. Namely, I have a DataFrame
that contains the test name, the date of execution, and the outcome, and I want to show how the percentage of failed cases decreased over time.
My data looks like this:
TestName;Date;IsPassed
test1;12/8/2016 9:44:30 PM;0
test1;12/8/2016 9:39:00 PM;0
test1;12/8/2016 9:38:29 PM;1
test1;12/8/2016 9:38:27 PM;1
test2;12/8/2016 5:05:02 AM;1
test3;12/7/2016 8:58:36 PM;0
test3;12/7/2016 8:57:19 PM;1
test3;12/7/2016 8:56:15 PM;1
test4;12/5/2016 6:50:49 PM;0
test4;12/5/2016 6:49:50 PM;0
test4;12/5/2016 3:23:09 AM;1
test4;12/4/2016 11:51:29 PM;1
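For reproducibility, here is a minimal sketch of loading the sample above into a DataFrame. It assumes the dates are in month/day/year order with a 12-hour clock; the names `RAW` and `tests` are only illustrative.

```python
import io

import pandas as pd

# The sample rows from above, semicolon-separated.
RAW = """TestName;Date;IsPassed
test1;12/8/2016 9:44:30 PM;0
test1;12/8/2016 9:39:00 PM;0
test1;12/8/2016 9:38:29 PM;1
test1;12/8/2016 9:38:27 PM;1
test2;12/8/2016 5:05:02 AM;1
test3;12/7/2016 8:58:36 PM;0
test3;12/7/2016 8:57:19 PM;1
test3;12/7/2016 8:56:15 PM;1
test4;12/5/2016 6:50:49 PM;0
test4;12/5/2016 6:49:50 PM;0
test4;12/5/2016 3:23:09 AM;1
test4;12/4/2016 11:51:29 PM;1
"""

tests = pd.read_csv(io.StringIO(RAW), sep=";")
# Assumption: month/day/year with AM/PM, as the sample suggests.
tests["Date"] = pd.to_datetime(tests["Date"], format="%m/%d/%Y %I:%M:%S %p")
```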
And I was using this code to plot the cases separately:
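from datetime import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# tests is the DataFrame shown above; Date is assumed to be parsed as datetime
fig, ax = plt.subplots()
passed = tests[tests.IsPassed == 1]
failed = tests[tests.IsPassed == 0]
# convert the timestamps to matplotlib date numbers so hist() can bin them
passed_dates = mdates.date2num(passed.Date.astype(datetime))
failed_dates = mdates.date2num(failed.Date.astype(datetime))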
ax.hist(passed_dates, bins=10, color='g')
ax.hist(failed_dates, bins=10, color='r')
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
plt.show()
But now I would like to:
- Divide the time span into a configurable number of buckets
- Count the number of test runs per bucket (without for loops, as there are a lot of entries in the DataFrame)
- Plot either a 100% area chart or a stacked histogram for each bucket, so that the count from the previous step corresponds to 100%
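To make the first two steps concrete, here is a rough sketch of what I have in mind, assuming `Date` is already a datetime column and `IsPassed` holds 0/1. The function name `failure_share_per_bucket` is made up for illustration; it relies on `pd.cut` for equal-width buckets and a `groupby` for the counting, so there is no explicit loop.

```python
import pandas as pd


def failure_share_per_bucket(tests: pd.DataFrame, n_buckets: int) -> pd.DataFrame:
    """Percentage of failed and passed runs per equal-width time bucket."""
    # pd.cut splits the [min, max] date range into n_buckets equal-width intervals.
    buckets = pd.cut(tests["Date"], bins=n_buckets)
    # Count runs per (bucket, outcome) pair; observed=True skips empty buckets.
    counts = (
        tests.groupby([buckets, "IsPassed"], observed=True)
        .size()
        .unstack(fill_value=0)
        .reindex(columns=[0, 1], fill_value=0)
    )
    # Scale every bucket so that its total number of runs equals 100%.
    share = counts.div(counts.sum(axis=1), axis=0) * 100
    return share.rename(columns={0: "failed", 1: "passed"})
```

The resulting frame has one row per non-empty bucket and the columns `failed` and `passed`, so something like `share.plot.area()` or `share.plot.bar(stacked=True)` could then draw the 100% chart.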
The problem right now is that the otherwise perfectly working solution with hist()
does the counting automatically, and I don't see a way to pass my own Y values to it.
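One possible workaround, sketched here under my own assumptions rather than as a confirmed solution, is to do the counting with `np.histogram` on shared bin edges and then draw the percentages with `ax.bar`, which does accept explicit heights. The function name `plot_outcome_share` is hypothetical, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend; drop this line to use a GUI
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np


def plot_outcome_share(passed_dates, failed_dates, n_buckets=10):
    """Draw stacked 100% bars from two arrays of matplotlib date numbers."""
    passed_dates = np.asarray(passed_dates, dtype=float)
    failed_dates = np.asarray(failed_dates, dtype=float)
    all_dates = np.concatenate([passed_dates, failed_dates])
    # Shared edges, so both histograms use exactly the same buckets.
    edges = np.linspace(all_dates.min(), all_dates.max(), n_buckets + 1)
    passed_counts, _ = np.histogram(passed_dates, bins=edges)
    failed_counts, _ = np.histogram(failed_dates, bins=edges)
    total = passed_counts + failed_counts
    # Percentage per bucket; empty buckets stay at 0 instead of dividing by zero.
    has_runs = total > 0
    failed_pct = np.divide(failed_counts * 100.0, total,
                           out=np.zeros(n_buckets), where=has_runs)
    passed_pct = np.divide(passed_counts * 100.0, total,
                           out=np.zeros(n_buckets), where=has_runs)

    fig, ax = plt.subplots()
    widths = np.diff(edges)
    # Failed share at the bottom, passed share stacked on top of it.
    ax.bar(edges[:-1], failed_pct, width=widths, align="edge",
           color="r", label="failed")
    ax.bar(edges[:-1], passed_pct, width=widths, align="edge",
           bottom=failed_pct, color="g", label="passed")
    ax.xaxis.set_major_locator(mdates.AutoDateLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%y'))
    ax.set_ylabel("share of test runs (%)")
    ax.legend()
    return fig, ax, failed_pct, passed_pct
```

With the `passed_dates` and `failed_dates` arrays from my code above, this would be called as `plot_outcome_share(passed_dates, failed_dates, n_buckets=10)`, followed by `plt.show()` or `fig.savefig(...)`.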
Update
Here is what I'd like to accomplish (taken from another source):