I am trying to create a Gantt chart in Python. Some of the tasks that I have to include in the chart have a duration of 0 days, meaning they have to be completed on the same day.

I tried this code, which I found online; it creates a basic Gantt chart with Plotly:

import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    dict(Task="1", Start='2023-03-15', End='2023-03-15'),
    dict(Task="2", Start='2023-03-03', End='2023-03-10'),
    dict(Task="3", Start='2023-03-10', End='2023-03-15'),
])

print(df)

fig = px.timeline(df, x_start="Start", x_end="End", y="Task")
fig.update_yaxes(autorange="reversed")
fig.show()

It works fine for tasks that have a duration of at least 1 day (like Task 2 and 3). However, tasks that have to be completed on the same day, like Task 1 in the example above, are not displayed in the Gantt chart after plotting it. The resulting chart only contains Task 2 and 3. The space next to the label of Task 1 stays empty.

Is there a way to display Task 1 (and other tasks that have to be completed on the same day) in the same Gantt chart as Task 2 and 3?

The Gantt chart doesn't necessarily have to be created with Plotly; Matplotlib would be fine too. Whatever works best and is easiest to use.

Grateful for any help!!

lunamaria

2 Answers

The example below provides similar functionality using matplotlib. It is adapted from a similar case at https://stackoverflow.com/a/76836805/21896093.

When there's a task that has a duration of 0 days, a small duration is assigned (0.1 days) so that it shows up. You can adjust it as desired.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib import patches

#
# Example data
#

#Original data
df = pd.DataFrame(
    {'Task': ['1', '2', '3'],
     'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
     'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
     }
)

#Convert to datetime, as we'll do some simple arithmetic between dates
for date_col in ['Start', 'End']:
    df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')

#
# Create plot
#
height = 0.9

f, ax = plt.subplots(figsize=(10, 6))
for idx in range(len(df)):
    y0 = (idx + 1) - height / 2
    x0 = df.iloc[idx].Start
    width = df.iloc[idx].End - x0
    if not width:
        #A zero-width rectangle is invisible, so use a small width instead
        width = pd.Timedelta(days=0.1)
    ax.add_patch(patches.Rectangle((x0, y0), width, height))
    ax.hlines(y0 + height / 2,
              xmin=df.Start.min(),
              xmax=x0,
              color='k', linestyles=':', linewidth=0.5)

#DateFormatter required as we're building the plot using patches,
#rather than supplying entire series
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator())
#Rotate via tick_params; ax.set_xticklabels(...) would swap the DateFormatter
#for fixed labels and can leave the date labels blank
ax.tick_params(axis='x', labelrotation=30)

ax.set_xlabel('Date')
ax.set_ylabel('Task')
ax.set_yticks(range(1, len(df) + 1))
ax.set_yticklabels(df.Task)
plt.show()
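
As an alternative to checking the width inside the plotting loop, the same padding idea can be applied to the DataFrame up front. The following is a minimal sketch under that assumption; `pad_zero_duration` and `min_width` are hypothetical names introduced here, not part of pandas or matplotlib.

```python
import pandas as pd


def pad_zero_duration(df, min_width=pd.Timedelta(days=0.1)):
    """Return a copy of df in which tasks with End == Start get a small duration.

    Hypothetical helper: assumes df has 'Start' and 'End' columns that
    pd.to_datetime can parse.
    """
    out = df.copy()
    out["Start"] = pd.to_datetime(out["Start"])
    out["End"] = pd.to_datetime(out["End"])
    # Rows whose task starts and ends at the same instant
    zero = out["End"] == out["Start"]
    # Replace End only on those rows; all other rows keep their End
    out["End"] = out["End"].mask(zero, out["Start"] + min_width)
    return out


tasks = pd.DataFrame({
    "Task": ["1", "2", "3"],
    "Start": ["2023-03-15", "2023-03-03", "2023-03-10"],
    "End": ["2023-03-15", "2023-03-10", "2023-03-15"],
})
padded = pad_zero_duration(tasks)
print(padded)
```

With a frame padded this way, the loop no longer needs its zero-width check. The padded frame should also be usable with `px.timeline` as in the question, though that combination has not been tested here.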

Update: version with segmented bars, as requested in the comments.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib import patches

#
# Example data
#

#Original data
df = pd.DataFrame(
    {'Task': ['1', '2', '3'],
     'Start': ['2023-03-15', '2023-03-03', '2023-03-10'],
     'End': ['2023-03-15', '2023-03-10', '2023-03-15'],
     }
)

#Convert to datetime, as we'll do some simple arithmetic between dates
for date_col in ['Start', 'End']:
    df[date_col] = pd.to_datetime(df[date_col], format='%Y-%m-%d')

#
# Create plot
#
height = 0.9
zero_width = pd.Timedelta(days=0.1)
segmentation_width = pd.Timedelta(days=1)
gap_between_days = pd.Timedelta(days=0.05)
one_day = pd.Timedelta(days=1)

f, ax = plt.subplots(figsize=(10, 6))
for idx in range(len(df)):
    y0 = (idx + 1) - height / 2
    x0 = df.iloc[idx].Start
    width = df.iloc[idx].End - x0

    if not width:
        #Zero-duration task: give it a small visible width
        width = zero_width

    #Split the bar into whole-day segments plus a remainder
    n_days = width // segmentation_width
    days_remainder = width % segmentation_width

    for day in range(n_days):
        day_td = pd.Timedelta(days=day)
        ax.add_patch(patches.Rectangle((x0 + day_td, y0),
                                       one_day - gap_between_days, height))

    #Draw the leftover part (shorter than one day), if there is any
    if days_remainder:
        n_days_td = pd.Timedelta(days=n_days)
        ax.add_patch(patches.Rectangle((x0 + n_days_td, y0),
                                       days_remainder, height))
    
    ax.hlines(y0 + height / 2,
              xmin=df.Start.min(),
              xmax=x0,
              color='k', linestyles=':', linewidth=0.5)

#DateFormatter required as we're building the plot using patches,
#rather than supplying entire series    
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.xaxis.set_major_locator(mdates.DayLocator())
plt.xticks(rotation=30)

ax.set_xlabel('Date')
ax.set_ylabel('Task')
ax.set_yticks(range(1, len(df) + 1))
ax.set_yticklabels(df.Task)
plt.show()
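
The code above segments a Start/End range, whereas the DataFrame described in the comments has one row per task and date. A rough sketch for that layout follows, assuming columns named `Task` and `Date` and a fixed bar width of 0.1 days. `daily_bars` and `plot_daily_bars` are hypothetical helper names; the plotting uses matplotlib's `Axes.broken_barh`, which is a swapped-in technique rather than the `Rectangle` patches used above.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def daily_bars(df, bar_width_days=0.1):
    """Group rows by task into (x_start, width) pairs in Matplotlib date units (days)."""
    x_starts = mdates.date2num(pd.to_datetime(df["Date"]).to_numpy())
    bars = {}
    for task, x in zip(df["Task"], x_starts):
        bars.setdefault(task, []).append((float(x), bar_width_days))
    return bars


def plot_daily_bars(bars, height=0.9):
    """Draw one row per task, with one small rectangle per listed date."""
    fig, ax = plt.subplots(figsize=(10, 6))
    for row, xranges in enumerate(bars.values(), start=1):
        ax.broken_barh(xranges, (row - height / 2, height))
    ax.xaxis_date()  # interpret x values as dates
    ax.xaxis.set_major_locator(mdates.DayLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
    ax.tick_params(axis='x', labelrotation=30)
    ax.set_xlabel('Date')
    ax.set_ylabel('Task')
    ax.set_yticks(range(1, len(bars) + 1))
    ax.set_yticklabels(list(bars))
    return fig, ax


# Example data in the one-row-per-day layout from the comments
df_daily = pd.DataFrame({
    'Task': ['1', '2', '2', '2', '2', '3', '3', '3'],
    'Date': ['2023-03-15', '2023-03-03', '2023-03-04', '2023-03-05',
             '2023-03-07', '2023-03-09', '2023-03-11', '2023-03-12'],
})
bars = daily_bars(df_daily)

if __name__ == "__main__":
    plot_daily_bars(bars)
    plt.show()
```

Tasks appear in the order in which they first occur in the DataFrame, and `bar_width_days` can be raised (for example to 1) if each day should fill its whole slot.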
some3128
  • Thank you so much! I've tried out your code, but somehow it doesn't display the individual dates underneath the x-axis. It just shows the label "Date" and empty tick marks above it. ax.set_xticklabels seems to cause this UserWarning: FixedFormatter should only be used together with FixedLocator. Maybe that's the reason why the dates are not being displayed? Do you know how to fix this? – lunamaria Aug 11 '23 at 22:54
  • I also wanted to ask whether it is possible to adapt your code so that it displays a rectangle for every single day of a task. So instead of one big rectangle per task from the start date to the end date, the chart should display one rectangle for every single day of the task (with a width of 0.1 days). The df I would like to use looks similar to this one: ` df = pd.DataFrame( {'Task': ['1', '2', '2', '2', '2', '3', '3', '3'], 'Date': ['2023-03-15', '2023-03-03', '2023-03-04', '2023-03-05', '2023-03-07', '2023-03-09', '2023-03-11', '2023-03-12'], } ) ` – lunamaria Aug 12 '23 at 00:04
  • You're welcome. I've updated my answer with an attempt at the features you described. – some3128 Aug 12 '23 at 10:53

If you want to display a task shorter than 1 day in a Plotly Gantt chart, the original data should include a time component, and the tick unit on the x-axis should be changed to match. The default is 1-day units. To use half-day units, for example, set dtick to 43200000 (1000 ms × 60 s × 60 min × 12 h).

import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    dict(Task="1", Start='2023-03-15 00:00:00', End='2023-03-15 12:00:00'),
    dict(Task="2", Start='2023-03-03', End='2023-03-10'),
    dict(Task="3", Start='2023-03-10', End='2023-03-15'),
])

fig = px.timeline(df, x_start="Start", x_end="End", y="Task")
fig.update_yaxes(autorange="reversed")
# On a date axis dtick is given in milliseconds: 12 h = 43200000 ms
fig.update_xaxes(dtick=43200000, tickangle=90, tickformat='%m/%d %H:%M')
fig.show()
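
The dtick value used above is simply the tick interval expressed in milliseconds. A small sketch of that arithmetic, using a hypothetical helper named `dtick_ms` (not a Plotly function), might look like this:

```python
import pandas as pd


def dtick_ms(interval):
    """Express a tick interval as whole milliseconds (hypothetical helper)."""
    return pd.Timedelta(interval) // pd.Timedelta(milliseconds=1)


half_day = dtick_ms(pd.Timedelta(hours=12))
one_hour = dtick_ms(pd.Timedelta(hours=1))
print(half_day)  # 43200000
print(one_hour)  # 3600000
```

The result can then be passed as `dtick` to `fig.update_xaxes`, which avoids hard-coding the number when a different interval is wanted.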

r-beginners