I get a clickstream with a fields user_id , site_id, datetime - timestamp time of click Session counts all clicks in the interval of 30 mins
How can I make SQL query with output of
user_id site_id session_start_time session_end_time
I get a clickstream with a fields user_id , site_id, datetime - timestamp time of click Session counts all clicks in the interval of 30 mins
How can I make SQL query with output of
user_id site_id session_start_time session_end_time
OK, so assuming that you're getting some data like
userid, siteid, datetime
1, 1, 2000-01-01 00:30
1, 1, 2000-01-01 01:00
1, 2, 2001-01-01 01:00
1, 2, 2001-01-01 01:30
2, 2, 2002-01-01 02:00
2, 2, 2002-01-01 02:30
And you're looking to make the date from the previous row be the start, and the date from this row be the end:
SELECT
userid,
siteid,
LAG(datetime) OVER(PARTITION BY siteid, userid ORDER BY datetime) as starttime, datetime as endtime
FROM
table
userid, siteid, starttime, endtime
1, 1, null, 2000-01-01 00:30
1, 1, 2000-01-01 00:30, 2000-01-01 01:00
1, 2, null, 2001-01-01 01:00
1, 2, 2001-01-01 01:00, 2001-01-01 01:30
2, 2, null, 2002-01-01 02:00
2, 2, 2002-01-01 02:00, 2002-01-01 02:30
Each date from a particular row x will become the start date for the following row x+1. The partition means that user and site id have their own start/end date. If this isn't intended, adjust the partition
To hide the nulls:
WITH cte AS(
SELECT
userid,
siteid,
LAG(datetime) OVER(PARTITION BY siteid, userid ORDER BY datetime) as starttime, datetime as endtime
FROM
table
)
SELECT * FROM cte WHERE starttime IS NOT NULL