1

I am trying to generate SessionId from UserId and difference in timestamps. The first user in the table is obviously assigned 1 and from there the sessionId has to be incremented by 1 if the same user logs in and the difference between current login and previous login is less than 5 minutes else it has to be incremented by 1. I am able to do this using for loop and nested if statements but wondering if it can be done by using dplyr or similar packages. There is already a similar question here but it sorts the UserId's and I want to achieve this without sorting UserId's.

Input and Output(SessionId) is in the same table.
 CustomerID           TimeStamp SessionID
        101 0000-01-01 01:00:00         1
        101 0000-01-01 01:03:00         1
        102 0000-01-01 01:05:00         2
        103 0000-01-01 01:06:00         3
        104 0000-01-01 01:06:00         4
        101 0000-01-01 01:09:00         5
        105 0000-01-01 01:10:00         6
        104 0000-01-01 01:10:00         4
        106 0000-01-01 01:11:00         7
        105 0000-01-01 01:12:00         6
        104 0000-01-01 01:12:00         4
        104 0000-01-01 01:18:00         8
        104 0000-01-01 01:20:00         8
        105 0000-01-01 01:21:00         9
        104 0000-01-01 01:26:00        10
Jason Mathews
  • 765
  • 3
  • 13

1 Answers1

1

In your case you can do group_by and convert the time different within the group by cumsum (this is one of the additional key help to determine the final group number) , then we using factor and as.numeric

df = df %>% group_by(CustomerID) %>% 
      mutate(x=paste(CustomerID, cumsum(c(F, diff(TimeStamp))>5)))
df$x = as.numeric(factor(df$x, as.character(unique(df$x))))
df
# A tibble: 15 x 4
# Groups:   CustomerID [6]
   CustomerID           TimeStamp SessionID     x
        <int>              <dttm>     <int> <dbl>
 1        101 0000-01-01 01:00:00         1     1
 2        101 0000-01-01 01:03:00         1     1
 3        102 0000-01-01 01:05:00         2     2
 4        103 0000-01-01 01:06:00         3     3
 5        104 0000-01-01 01:06:00         4     4
 6        101 0000-01-01 01:09:00         5     5
 7        105 0000-01-01 01:10:00         6     6
 8        104 0000-01-01 01:10:00         4     4
 9        106 0000-01-01 01:11:00         7     7
10        105 0000-01-01 01:12:00         6     6
11        104 0000-01-01 01:12:00         4     4
12        104 0000-01-01 01:18:00         8     8
13        104 0000-01-01 01:20:00         8     8
14        105 0000-01-01 01:21:00         9     9
15        104 0000-01-01 01:26:00        10    10
BENY
  • 317,841
  • 20
  • 164
  • 234