So, I want to calculate the time differences; and the file looks something like this
id | message_id | send_date | status |
---|---|---|---|
0 | 5f74b996a2b7e | 2020-10-01 00:00:07 | sent |
1 | 5f74b996a2b7e | 2020-10-01 00:00:09 | delivered |
2 | 5f74b99e85b3c | 2020-10-02 02:00:14 | sent |
3 | 5f74b99e85b3c | 2020-10-02 02:01:16 | delivered |
4 | 5f74b99e85b3c | 2020-10-02 08:06:49 | read |
5 | 5f74b996a2b7e | 2020-10-02 15:16:32 | read |
6 | 5f9d97ff1af9e | 2020-10-14 13:45:43 | sent |
7 | 5f9d97ff1af9e | 2020-10-14 13:45:45 | delivered |
8 | 5f9d97ff1af9e | 2020-10-14 13:50:48 | read |
9 | 5f74b9a35b6c5 | 2020-10-16 19:01:19 | sent |
10 | 5f74b9a35b6c5 | 2020-10-16 19:01:25 | deleted |
Inside is id which increment, message_id is unique to each message, send_date is the time, status is the message status (it has 5 statuses; sent, delivered, read, failed, and deleted).
I wanted to calculate the time differences when the message was sent then delivered, if delivered then read.
I know something like this can be handy, but I wasn't sure how to assign it uniquely to each of the message_id
from datetime import datetime
s1 = '2020-10-14 13:45:45'
s2 = '2020-10-14 13:50:48' # for example
FMT = '%Y-%m-%d %H:%M:%S'
tdelta = datetime.strptime(s2, FMT) - datetime.strptime(s1, FMT)
print(tdelta)
Ref: https://stackoverflow.com/questions/3096953/how-to-calculate-the-time-interval-between-two-time-strings
The expected output would be,
message_id | delivered_diff | read_diff | deleted_diff | |
---|---|---|---|---|
0 | 5f74b996a2b7e | 00:00:02 | 1 day, 15:16:23 | |
1 | 5f74b99e85b3c | 00:01:02 | 6:05:33 | |
2 | 5f9d97ff1af9e | 00:00:02 | 0:05:03 | |
3 | 5f74b9a35b6c5 | 0:00:06 |