1

I get a clickstream with a fields user_id , site_id, datetime - timestamp time of click Session counts all clicks in the interval of 30 mins

How can I make SQL query with output of

user_id    site_id    session_start_time    session_end_time
robsiemb
  • 6,157
  • 7
  • 32
  • 46
Gani
  • 29
  • 4
  • 2
    Sample data and expected results would help understanding your question better. – GMB Nov 12 '19 at 22:47
  • 1
    what database are you using? – Caius Jard Nov 12 '19 at 22:49
  • postgresSQL @CaiusJard – Gani Nov 12 '19 at 22:58
  • Welcome to SO, Gani! To get better help, you should read [Why should I provide a Minimal Reproducible Example for a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query) and then edit your post with the example. – robsiemb Nov 13 '19 at 01:23

1 Answers1

0

OK, so assuming that you're getting some data like

userid, siteid, datetime
1,      1,      2000-01-01 00:30
1,      1,      2000-01-01 01:00
1,      2,      2001-01-01 01:00
1,      2,      2001-01-01 01:30
2,      2,      2002-01-01 02:00
2,      2,      2002-01-01 02:30

And you're looking to make the date from the previous row be the start, and the date from this row be the end:

SELECT 
  userid,
  siteid,
  LAG(datetime) OVER(PARTITION BY siteid, userid ORDER BY datetime) as starttime, datetime as endtime 
FROM 
  table

userid, siteid, starttime,        endtime
1,      1,      null,             2000-01-01 00:30
1,      1,      2000-01-01 00:30, 2000-01-01 01:00
1,      2,      null,             2001-01-01 01:00
1,      2,      2001-01-01 01:00, 2001-01-01 01:30
2,      2,      null,             2002-01-01 02:00
2,      2,      2002-01-01 02:00, 2002-01-01 02:30

Each date from a particular row x will become the start date for the following row x+1. The partition means that user and site id have their own start/end date. If this isn't intended, adjust the partition

To hide the nulls:

WITH cte AS(
 SELECT 
  userid,
  siteid,
  LAG(datetime) OVER(PARTITION BY siteid, userid ORDER BY datetime) as starttime, datetime as endtime 
 FROM 
  table
)
SELECT * FROM cte WHERE starttime IS NOT NULL
Caius Jard
  • 72,509
  • 5
  • 49
  • 80
  • I want to get userid siteid starttime endtime `1, 1, 2000-01-01 00:30, 2000-01-01 01:00 1, 1, 2000-01-01 01:00, 2001-01-01 01:30 1, 2, 2001-01-01 01:00, 2001-01-01 01:30` – Gani Nov 12 '19 at 23:41
  • Ok so just surround it all with another query havin a where clause where starttime is not null? – Caius Jard Nov 12 '19 at 23:46
  • Yes I want the start time to be the first datetime since it would be the first interation(click of a user ) – Gani Nov 12 '19 at 23:49
  • I added an example of how to use a cte to hide the nulls – Caius Jard Nov 13 '19 at 07:29