1

I have a page audit table that records which pages a user has accessed. Given an specific page, I need to find what previous page the user has accessed and what was the most accessed.

For example, the FAQ Page_ID is 3. I want to know if it is more frequently accessed from the First Access page (ID 1) or Home page (ID 5).

Example:

Page Audit Table (SQL Server)

ID  | Page_ID | User_ID
1   | 1       | 6
2   | 3       | 6
3   | 5       | 4
4   | 3       | 4
5   | 1       | 7
6   | 3       | 7
7   | 1       | 5
8   | 3       | 2        --Note: previous user is not 2
9   | 3       | 5        --Note: previous page for user 5 is 1 and not 3

Looking for Page_ID = 3, I want to retrieve:

Previous Page | Count
1             | 3
5             | 1

Note: I've looked some similar questions here (like that one), but it didn't help me to solve this problem.

Community
  • 1
  • 1
sqlover90
  • 48
  • 7

3 Answers3

3

You can use window functions as one way to figure this out:

with UserPage as (
  select
    User_ID,
    Page_ID,
    row_number() over (partition by User_ID order by ID) as rn
  from
    PageAudit
)
select
  p1.Page_ID,
  count(*)
from
  UserPage p1
    inner join
  UserPage p2
    on p1.User_ID = p2.User_ID and
       p1.rn + 1 = p2.rn
where
  p2.Page_ID = 3
group by
  p1.Page_ID;

SQLFiddle Demo

If you have SQL2012, the answers using lag will be a lot more efficient. This one works on SQL2008 too.

For reference, as I think one of the lag solutions is over complicated, and one is wrong:

with prev as (
  select
    page_id,
    lag(page_id,1) over (partition by user_id order by id) as prev_page
  from 
    PageAudit
)
select
  prev_page,
  count(*)
from
  prev
where
  page_id = 3 and
  prev_page is not null -- people who landed on page 3 without a previous page
group by 
  prev_page

SQLFiddle Example of Lag

Laurence
  • 10,896
  • 1
  • 25
  • 34
  • Thank you very much for your clear and complete answer. Just one suggestion: your second fiddle is missing the `and prev_page is not null` – sqlover90 Aug 25 '14 at 02:14
2
select prev_page, count(*)
  from (select id,
               page_id,
               user_id,
               lag(page_id, 1) over(partition by user_id order by id) as prev_page
          from page_audit_table) x
 where page_id = 3
   and prev_page <> page_id
 group by prev_page

Fiddle: http://sqlfiddle.com/#!6/c0037/23/0

Brian DeMilia
  • 13,103
  • 1
  • 23
  • 33
  • 1
    I think you want `partition by user_id` in there. e.g. http://sqlfiddle.com/#!6/7f8c8/1 2 shouldn't appear as a previous page. – Laurence Aug 25 '14 at 01:19
1

You could use the LAG function (It is available only in MS SQL Server 2012+).

Test with this fiddle.

Query:

SELECT
    previous_page, count(previous_page) as count
FROM
    (SELECT
        Page_id,
        LAG(Page_ID, 1, NULL) OVER (PARTITION BY User_ID ORDER BY ID) as previous_page,
        User_ID as current_usr,
        LAG(User_ID, 1, NULL) OVER (PARTITION BY User_ID ORDER BY ID) as previous_usr
    FROM
        Page_Audit) p
WHERE
    Page_ID = 3 AND current_usr = previous_usr
GROUP BY
    previous_page
ORDER BY
    count DESC
Zanon
  • 29,231
  • 20
  • 113
  • 126