I have a list of tickets with data on: ticket name, created date, status, closed date.
A new column will be calculated based on created / closed date. If a new ticket with the status = open is created that month, the new column value will increase by one. Value will decrease by one on the month the ticket is moved to a closed status.
How would I set up a Df with the index described above, and how would I go about doing the cumulative calculation in pandas? I'm specifially struggling with setting index as a time series of dates and and having the issues show up on the correct row Starting Data:
ID Created Date Closed Date
0 FND-1974 2021-10-18 00:00:00 2022-03-31
1 FND-10310 2021-10-18 00:00:00 2022-03-31
2 FND-10310 2021-10-18 00:00:00 2022-03-31
3 FND-10310 2021-07-21 00:00:00 NaT
4 FND-9862 2021-07-20 00:00:00 2022-02-28
.. ... ... ...
100 41 2020-04-13 13:34:39 NaT
101 40 2020-04-13 13:32:14 NaT
102 35 2020-04-01 17:48:23 NaT
103 18 2020-01-21 16:08:54 NaT
104 4 2020-02-25 14:56:37 NaT
Current approach:
df = pd.DataFrame(index= pd.Series(pd.date_range('2021-7-1', dt.date.today(),freq="D")))
df['ID'] = df_agg['Exception_ID']
df['Created Date'] = df_agg['Created_On_Date']
df['Closed Date'] = df_agg['Closed_Date']
df['count'] = 0
for index, row in df.iterrows():
if index >= row['Created Date']:
row['count'] += 1
if index >= row['Closed Date']:
row['count'] -= 1
print(df.head)
Output:
ID Created Date Closed Date count
2021-07-01 NaN NaT NaT 0
2021-07-02 NaN NaT NaT 0
2021-07-03 NaN NaT NaT 0
2021-07-04 NaN NaT NaT 0
2021-07-05 NaN NaT NaT 0
... ... ... ...
2022-03-20 NaN NaT NaT 0
2022-03-21 NaN NaT NaT 0
2022-03-22 NaN NaT NaT 0
2022-03-23 NaN NaT NaT 0
2022-03-24 NaN NaT NaT 0
Obviously, I would like to populate the row when the ID, Created Date, Closed Date and add 1 to our count when an issue has been opened on a specific day. im losing it trying to figure this out