I have data which looks like this:
USER_ID   ROLE           SESSION_ID  TIMESTAMP
f684d34a  VP Marketing   890485E     51:16.2
f684d34a  VP Marketing   890485E     51:13.5
f684d34a  VP Marketing   890485E     51:39.3
f684d34a  VP Marketing   890485E     51:32.2
f684d34a  VP Marketing   890485E     51:20.0
f684d34a  VP, Marketing  890485E     49:52.9
f684d34a  VP Marketing   890485E     51:17.6
f684d34a  VP Marketing   890485E     52:14.2
f684d34a  VP Marketing   890485E     51:41.3
f684d34a  VP Marketing   890485f     51:41.1
f684d34a  VP Marketing   890485f     51:41.4
This is a subset; each USER_ID can have multiple SESSION_ID values and multiple ROLE values.
I want to get the distinct combinations of USER_ID and SESSION_ID, together with ROLE. When the same USER_ID and SESSION_ID pair appears with different roles, I want to keep the role from the row with the oldest (earliest) timestamp.
Final output:
USER_ID   ROLE           SESSION_ID  TIMESTAMP
f684d34a  VP, Marketing  890485E     49:52.9
f684d34a  VP Marketing   890485f     51:41.1
As you can see in the first row, each combination of USER_ID and SESSION_ID now appears only once. Because I also want ROLE, I took the role from the row with the oldest timestamp for that combination (49:52.9).
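To make the rule concrete, here is a minimal sketch of the logic in plain Python, since I have not settled on a tool. It assumes the timestamps are minutes:seconds within the same hour (so they can be compared as a number of seconds) and that SESSION_ID is case-sensitive, so 890485E and 890485f count as different sessions. The helper name to_seconds is only for illustration. The same idea should carry over to a sort followed by "keep the first row per group" in a dataframe library, or to a window function in SQL.

```python
# Sample rows copied from the data above: (USER_ID, ROLE, SESSION_ID, TIMESTAMP)
rows = [
    ("f684d34a", "VP Marketing", "890485E", "51:16.2"),
    ("f684d34a", "VP Marketing", "890485E", "51:13.5"),
    ("f684d34a", "VP Marketing", "890485E", "51:39.3"),
    ("f684d34a", "VP Marketing", "890485E", "51:32.2"),
    ("f684d34a", "VP Marketing", "890485E", "51:20.0"),
    ("f684d34a", "VP, Marketing", "890485E", "49:52.9"),
    ("f684d34a", "VP Marketing", "890485E", "51:17.6"),
    ("f684d34a", "VP Marketing", "890485E", "52:14.2"),
    ("f684d34a", "VP Marketing", "890485E", "51:41.3"),
    ("f684d34a", "VP Marketing", "890485f", "51:41.1"),
    ("f684d34a", "VP Marketing", "890485f", "51:41.4"),
]


def to_seconds(ts):
    """Convert 'MM:SS.s' to seconds (assumption: all rows share the same hour)."""
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + float(seconds)


# Keep, for each (USER_ID, SESSION_ID) pair, the row with the earliest timestamp.
earliest = {}
for user_id, role, session_id, ts in rows:
    key = (user_id, session_id)
    if key not in earliest or to_seconds(ts) < to_seconds(earliest[key][3]):
        earliest[key] = (user_id, role, session_id, ts)

# Sort by USER_ID, then SESSION_ID, for a stable display order.
result = sorted(earliest.values(), key=lambda r: (r[0], r[2]))

for row in result:
    print(row)
```

For the sample data this keeps the "VP, Marketing" row at 49:52.9 for session 890485E and the 51:41.1 row for session 890485f, which matches the final output I am after.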