
I have a df like the following:

     DATE_MIN    DATE_MAX
214  1994-06-29  2010-07-12
125  1969-10-26  2011-10-10
123  2013-07-02  2015-01-29
74   2006-01-05  2016-06-20

Columns are: DATE_MIN, DATE_MAX

I would like to plot a tall chart with one horizontal line per df row:

  • With the date range on the x axis
  • With the index on the y axis, so each line is horizontal and the lines never cross each other.

That way I can see which rows overlap in time, because their lines cover the same stretch of the x axis (without crossing).

Super thanks in advance!

alexisdevarennes
miguelfg
  • Aren't you actually asking for a gantt? – rpanai Jan 24 '20 at 12:56
  • @rpanai it seems that the closest chart in matplotlib is 'hlines'; it worked well enough for me, thanks – miguelfg Jan 24 '20 at 14:00
  • 1
    Does this answer your question? [How to plot stacked event duration (Gantt Charts) using Python Pandas?](https://stackoverflow.com/questions/31820578/how-to-plot-stacked-event-duration-gantt-charts-using-python-pandas) – miguelfg Jan 27 '20 at 12:39

2 Answers


If you are happy to make a Gantt chart and use plotly, you could follow its documentation for create_gantt:

import pandas as pd
from io import StringIO
import plotly.figure_factory as ff

txt="""TASK  DATE_MIN DATE_MAX
214 1994-06-29  2010-07-12
125 1969-10-26  2011-10-10
123 2013-07-02  2015-01-29
74  2006-01-05  2016-06-20
"""

# sep=r"\s+" splits on any whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(StringIO(txt), sep=r"\s+")

data = []
for i,row in df.iterrows():
    data.append(dict(Task=str(row["TASK"]),
                     Start=str(row["DATE_MIN"]),
                     Finish=str(row["DATE_MAX"])))

fig = ff.create_gantt(data)
fig.show()


rpanai

Thanks for the super quick answers so far. I actually found a solution that works well for me here.

My code looks like this:

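import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Make sure both columns hold real datetimes before plotting
df['DATE_MIN'] = pd.to_datetime(df['DATE_MIN'])
df['DATE_MAX'] = pd.to_datetime(df['DATE_MAX'])
df = df.sort_values(by=['DATE_MIN', 'DATE_MAX'])

fig, ax = plt.subplots(figsize=(10, 50))
# One horizontal line per row; y is the row index (the sample df has no ID column)
ax.hlines(df.index,
          mdates.date2num(df['DATE_MIN']),
          mdates.date2num(df['DATE_MAX']))
ax.xaxis_date()  # show the x values as dates; do not reassign ax, this returns None
plt.show()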
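If you also want the overlapping rows as a list instead of reading them off the chart, here is a minimal sketch that is not part of the linked solution. It assumes the sample data from the question with the ids as index (the column name ID and the helper overlapping_pairs are made up for the example), and it treats two ranges as overlapping when each one starts on or before the day the other ends:

```python
from io import StringIO
from itertools import combinations

import pandas as pd

txt = """ID DATE_MIN DATE_MAX
214 1994-06-29 2010-07-12
125 1969-10-26 2011-10-10
123 2013-07-02 2015-01-29
74 2006-01-05 2016-06-20
"""

df = pd.read_csv(StringIO(txt), sep=r"\s+", index_col="ID",
                 parse_dates=["DATE_MIN", "DATE_MAX"])


def overlapping_pairs(frame):
    """Return (id_a, id_b) for every pair of rows whose date ranges overlap."""
    pairs = []
    for a, b in combinations(frame.index, 2):
        row_a, row_b = frame.loc[a], frame.loc[b]
        # Two closed intervals overlap when each starts before the other ends
        if row_a.DATE_MIN <= row_b.DATE_MAX and row_b.DATE_MIN <= row_a.DATE_MAX:
            pairs.append((int(a), int(b)))
    return pairs


pairs = overlapping_pairs(df)
print(pairs)  # [(214, 125), (214, 74), (125, 74), (123, 74)]
```

This compares every pair of rows, so it is fine for a few thousand rows but gets slow for much larger frames.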
miguelfg