I have a bunch of users who visit my website, and I need to count their sessions per ID. A session covers every connection from the same ID within 30 minutes of that session's first login. A connection more than 30 minutes after that first login is counted as the start of a new session.
Sample input:
id,timestamp_datetime
1,2020-04-25 21:28:57.499 # id 1 - first session
1,2020-04-25 21:41:41.691
1,2020-04-25 21:41:11.055
1,2020-04-25 22:00:00.015 # id 1 - second session (more than 30 minutes after the first login)
2,2020-04-25 21:41:41.691 # id 2 - first session
2,2020-04-25 22:00:00.015
2,2020-04-25 22:30:03.838 # id 2 - second session
3,2020-04-25 21:41:41.691
Sample output:
id, count_sessions
1, 2
2, 2
3, 1
I have tried this:
df.groupby([df.index.to_period('30T'),"id"]).count()
But it gave me the wrong results. Please help me fix it.