I have the following data:
I want to create a Gantt chart that represents a timeline in Python. I looked at another post with a similar problem (How to get gantt plot using matplotlib), but its code didn't work for me and I can't solve the issue on my own. It seems to have something to do with the data type of my "time" values. Here is the code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('zpp00141_new.csv')
df = df.dropna(subset=['Latest finish / time', 'Earl. start / time'])
#error when I try to change data type of the columns to int
df["Latest finish / time"]= df["Latest finish / time"].astype(int)
df["Earl. start / time"]= df["Earl. start / time"].astype(int)
#error below with data types
df["Diff"] = df['Latest finish / time'] - df['Earl. start / time']
color = {"In":"turquoise", "Out":"crimson"}
fig,ax=plt.subplots(figsize=(6,3))
labels=[]
for i, task in enumerate(df.groupby("Operation/Activity")):
    labels.append(task[0])
    for r in task[1].groupby("Operation short text"):
        data = r[1][["Earl. start / time", "Diff"]]
        ax.broken_barh(data.values, (i-0.4,0.8), color=color[r[0]] )
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)
ax.set_xlabel("time [ms]")
plt.tight_layout()
plt.show()
I tried to convert the columns from object to int, but that raised another error: "invalid literal for int() with base 10: '9:22:00 AM'". I would really appreciate any help, as I am quite new to programming in Python. If there is a simpler or better way to get what I need, any tips would be welcome. Basically, I need a Gantt chart that shows each activity on a timeline from 7:00 AM to 4:30 PM, with a vertical line over the chart marking the current time.
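For reference, here is a rough sketch of the direction I think this needs to go, based on the error message: the values are strings like '9:22:00 AM', so they probably have to be parsed as times rather than cast to int. Everything below is an assumption on my side: the sample rows are made up to stand in for my CSV, DAY is an arbitrary anchor date (my strings carry no date), and to_num, add_time_columns and plot_gantt are hypothetical helper names. Only the column names and the "In"/"Out" colors come from my code above.

```python
from datetime import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

START, FINISH = "Earl. start / time", "Latest finish / time"
DAY = "2024-01-01"  # assumed anchor date; the time strings carry no date


def to_num(clock):
    """Convert an 'HH:MM' clock string on DAY to a matplotlib date number."""
    return mdates.date2num(datetime.strptime(f"{DAY} {clock}", "%Y-%m-%d %H:%M"))


def add_time_columns(df):
    """Parse 'h:mm:ss AM/PM' strings and add a 'Diff' duration column (in days)."""
    out = df.dropna(subset=[START, FINISH]).copy()
    for col in (START, FINISH):
        stamps = pd.to_datetime(DAY + " " + out[col], format="%Y-%m-%d %I:%M:%S %p")
        out[col] = mdates.date2num(stamps.to_numpy())
    out["Diff"] = out[FINISH] - out[START]
    return out


def plot_gantt(df, now_clock=None):
    """Draw one row per activity plus a dashed vertical line at now_clock."""
    color = {"In": "turquoise", "Out": "crimson"}
    fig, ax = plt.subplots(figsize=(8, 3))
    labels = []
    for i, (activity, rows) in enumerate(df.groupby("Operation/Activity")):
        labels.append(activity)
        for kind, part in rows.groupby("Operation short text"):
            bars = list(zip(part[START], part["Diff"]))  # (xmin, xwidth) pairs
            ax.broken_barh(bars, (i - 0.4, 0.8), facecolors=color[kind])
    if now_clock is None:
        now_clock = datetime.now().strftime("%H:%M")
    ax.axvline(to_num(now_clock), color="black", linestyle="--")
    ax.set_xlim(to_num("07:00"), to_num("16:30"))
    ax.xaxis_date()
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    ax.set_xlabel("time of day")
    fig.tight_layout()
    return fig, ax


# Made-up rows standing in for zpp00141_new.csv
sample = pd.DataFrame({
    "Operation/Activity": ["0010", "0010", "0020"],
    "Operation short text": ["In", "Out", "In"],
    START: ["7:15:00 AM", "9:22:00 AM", "1:05:00 PM"],
    FINISH: ["9:00:00 AM", "11:30:00 AM", "3:45:00 PM"],
})
prepared = add_time_columns(sample)
fig, ax = plot_gantt(prepared, now_clock="10:30")
# plt.show()  # uncomment when running interactively
```

With the real file, I assume sample would be replaced by the result of pd.read_csv('zpp00141_new.csv'), and now_clock left as None so the line follows the current time. I am not sure this is the right approach, so corrections are welcome.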