I have a dataset with a session id, user id, TimeStamp in UNIX (which I converted using lubridate), and the converted TimeStamp column.
Session | User | ts_UNIX | TimeStamp |
---|---|---|---|
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:32 |
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:33 |
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:37 |
124 | 346 | UNIX Timestamp | 14-06-2022 15:50:10 |
124 | 346 | UNIX Timestamp | 14-06-2022 15:51:01 |
124 | 346 | UNIX Timestamp | 14-06-2022 16:30:00 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:55:30 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:58:50 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:59:45 |
125 | 345 | UNIX Timestamp | 15-06-2022 00:00:32 |
125 | 345 | UNIX Timestamp | 15-06-2022 00:00:59 |
I would like to add another column called session_duration (in seconds) which is the difference between the max_time and min_time grouped by Session and User. For instance, for session # 123 and user 345, the session duration is [14-06-2022 17:44:37] - [14-06-2022 17:44:32] which is 5 seconds.
Session | User | ts_UNIX | TimeStamp | session_duration (seconds) |
---|---|---|---|---|
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:32 | 5 |
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:33 | 5 |
123 | 345 | UNIX Timestamp | 14-06-2022 17:44:37 | 5 |
124 | 346 | UNIX Timestamp | 14-06-2022 15:50:10 | 2390 |
124 | 346 | UNIX Timestamp | 14-06-2022 15:51:01 | 2390 |
124 | 346 | UNIX Timestamp | 14-06-2022 16:30:00 | 2390 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:55:30 | 329 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:58:50 | 329 |
125 | 345 | UNIX Timestamp | 14-06-2022 23:59:45 | 329 |
125 | 345 | UNIX Timestamp | 15-06-2022 00:00:32 | 329 |
125 | 345 | UNIX Timestamp | 15-06-2022 00:00:59 | 329 |
This is what my current code looks like. The timestamp has successfully converted, but I am facing an issue with the session duration column.
library(tidyverse)
library(lubridate)
df <- df %>%
mutate(timestamp = as_datetime(ts_unix/1000)) %>%
group_by (session, user, timestamp) %>%
mutate(session_duration = difftime (max(timestamp), min(timestamp), units = "secs"))
Can someone please help me figure out the session_duration column? Thank you.