I have a dataframe as below:
CallID | StorageDate | CloseDate | Time Delta |
---|---|---|---|
1 | 2023-02-08 14:35:09 | 2023-02-08 14:35:56 | |
1 | 2023-02-08 14:35:56 | 2023-02-08 14:42:00 | value |
2 | 2023-02-07 10:17:18 | 2023-02-07 10:22:23 | |
2 | 2023-02-07 10:22:23 | 2023-02-07 15:09:14 | |
2 | 2023-02-07 15:09:14 | 2023-02-07 16:20:50 | |
2 | 2023-02-07 16:20:49 | 2023-02-08 09:23:16 | |
2 | 2023-02-08 09:23:16 | 2023-02-08 09:27:21 | value |
3 | 2023-03-10 10:31:25 | 2023-03-10 10:41:37 | |
3 | 2023-03-10 10:41:37 | 2023-03-10 14:23:18 | value |
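
For anyone who wants to reproduce this, the sample above could be rebuilt roughly as follows. This is only a sketch: the name `df` and the parsing with `pd.to_datetime` are assumptions, and the Time Delta column is left out because it is the value to be computed.

```python
import pandas as pd

# Rows copied from the table above; `df` is an assumed name for the main dataframe.
df = pd.DataFrame({
    "CallID": [1, 1, 2, 2, 2, 2, 2, 3, 3],
    "StorageDate": pd.to_datetime([
        "2023-02-08 14:35:09", "2023-02-08 14:35:56",
        "2023-02-07 10:17:18", "2023-02-07 10:22:23", "2023-02-07 15:09:14",
        "2023-02-07 16:20:49", "2023-02-08 09:23:16",
        "2023-03-10 10:31:25", "2023-03-10 10:41:37",
    ]),
    "CloseDate": pd.to_datetime([
        "2023-02-08 14:35:56", "2023-02-08 14:42:00",
        "2023-02-07 10:22:23", "2023-02-07 15:09:14", "2023-02-07 16:20:50",
        "2023-02-08 09:23:16", "2023-02-08 09:27:21",
        "2023-03-10 10:41:37", "2023-03-10 14:23:18",
    ]),
})

# Both date columns are real datetimes, so they can be subtracted directly.
print(df.dtypes)
```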
To compute the Time Delta, I am doing the following:

    delta_time = a.iloc[-1]['CloseDate'] - a.iloc[0]['StorageDate']
For each CallID (16,821 in total), I need to subtract the first StorageDate from the last CloseDate. The resulting delta_time must go in the last row of that CallID, marked `value` in the table above (the same row the CloseDate comes from).
This is my current attempt:

    callid = 1
    while callid <= 16821:
        df1 = df1[df1['CallID'] == callid]
        delta_time = df1.iloc[-1]['CloseDate'] - df1.iloc[0]['StorageDate']
        callid += 1
The problem is that I'm not able to assign the delta_time value to the correct row.
Before that, I tried using loc and iloc, and I managed to write the value to the correct row in df1 with the following structure:

    delta_time = df1.iloc[-1]['CloseDate'] - df1.iloc[0]['StorageDate']
    df1.loc[1, 'Time Delta'] = delta_time

It works, but it's inefficient, since I have to change the row label inside loc for every CallID, and iloc[-1] doesn't seem to work. Moreover, I don't know how to write the result to the main dataframe rather than only to the one I created to do the math.
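
One possible way to avoid hard-coding the label, sketched under assumptions: keep each subset in its own variable (here the hypothetical `sub`) so the main dataframe is never overwritten, and read the label of the subset's last row from `sub.index[-1]`. Because a filtered subset keeps the index of the main dataframe, that label can be passed straight to `loc` on the main dataframe. The name `df` and the shortened sample (CallID 1 and 3 only) are assumptions.

```python
import pandas as pd

# Shortened sample (CallID 1 and 3 from the table); `df` is an assumed name.
df = pd.DataFrame({
    "CallID": [1, 1, 3, 3],
    "StorageDate": pd.to_datetime([
        "2023-02-08 14:35:09", "2023-02-08 14:35:56",
        "2023-03-10 10:31:25", "2023-03-10 10:41:37",
    ]),
    "CloseDate": pd.to_datetime([
        "2023-02-08 14:35:56", "2023-02-08 14:42:00",
        "2023-03-10 10:41:37", "2023-03-10 14:23:18",
    ]),
})

# Start with an all-NaT timedelta column so assigned values keep a timedelta dtype.
df["Time Delta"] = pd.Series([pd.NaT] * len(df), index=df.index,
                             dtype="timedelta64[ns]")

for callid in df["CallID"].unique():
    # Filter into a separate variable; the main dataframe stays intact.
    sub = df[df["CallID"] == callid]
    delta_time = sub["CloseDate"].iloc[-1] - sub["StorageDate"].iloc[0]
    # sub.index[-1] is the main-dataframe label of this CallID's last row.
    df.loc[sub.index[-1], "Time Delta"] = delta_time

print(df)
```

This still filters the whole dataframe once per CallID, so with thousands of IDs it is likely to be slow. It is shown mainly to illustrate how the row label can be found.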
Can anybody help me here?
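
A loop-free variant might look like the sketch below. It is not necessarily the best approach, and it assumes the rows are already in chronological order within each CallID, since "first" and "last" here mean row order, not earliest or latest date. The name `df` and the shortened sample (CallID 1 and 2 only) are assumptions as well.

```python
import pandas as pd

# Shortened sample (CallID 1 and 2 from the table); `df` is an assumed name.
df = pd.DataFrame({
    "CallID": [1, 1, 2, 2, 2, 2, 2],
    "StorageDate": pd.to_datetime([
        "2023-02-08 14:35:09", "2023-02-08 14:35:56",
        "2023-02-07 10:17:18", "2023-02-07 10:22:23", "2023-02-07 15:09:14",
        "2023-02-07 16:20:49", "2023-02-08 09:23:16",
    ]),
    "CloseDate": pd.to_datetime([
        "2023-02-08 14:35:56", "2023-02-08 14:42:00",
        "2023-02-07 10:22:23", "2023-02-07 15:09:14", "2023-02-07 16:20:50",
        "2023-02-08 09:23:16", "2023-02-08 09:27:21",
    ]),
})

grouped = df.groupby("CallID")

# Broadcast each group's last CloseDate and first StorageDate to all of its rows.
delta = grouped["CloseDate"].transform("last") - grouped["StorageDate"].transform("first")

# True only on the last row of each CallID (row order is assumed to be correct).
is_last = ~df.duplicated(subset="CallID", keep="last")

# Keep the delta on the last row of each CallID; every other row becomes NaT.
df["Time Delta"] = delta.where(is_last)

print(df)
```

Because the result is assigned directly to a column of the main dataframe, no intermediate per-CallID dataframe is needed.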