1

current data & output

I'm trying to use dense_rank() function over the pagename column after the data is ordered by time_id. Expected output in rank column, rn, is: [1,2,2,3,4].

Currently I wrote it as:

with tbl2 as 
(select UID, pagename, date_id, time_id, source--, dense_rank() over(partition by UID order by pagename) as rn 
 from tbl1
 order by time_id)

select *, dense_rank() over(partition by UID order by time_id, pagename) as rn
from tbl2 

Any help would be appreciated

Edit 1: What I am trying to achieve here is to give ranks, as per the user on-screen action flow, to the pages that are visited. Suppose if the same page 'A' is visited back after visiting a different page 'B' then the ranks for these page visits A, B, A will be 1,2,3 (note that the same page A has different ranks 1 & 3)

  • Please provide data and table structure as text, not images. – Jim Jones Feb 26 '21 at 09:22
  • @MarkRotteveel, I thought that since this is applicable to both of them (as I have used both), I tagged these. But took your suggestion and tagged only postgres which I used for this – Aravind_005 Feb 26 '21 at 10:12
  • @JimJones, table column names would suffice for this purpose, I think. And the code is shared for the same purpose and also to show my work – Aravind_005 Feb 26 '21 at 10:14

3 Answers3

1

step-by-step demo:db<>fiddle

SELECT 
    *, 
    SUM(is_diff) OVER (ORDER BY date_id, time_id, page) 
FROM (
    SELECT 
        *,
        CASE WHEN page = lag(page) over (order by date_id, time_id) THEN 0 ELSE 1 END as is_diff
    FROM mytable 
)s

This looks exactly like a problem I asked some years ago: Window functions: PARTITION BY one column after ORDER BY another

You want to execute a window function on columns (uuid, page) but want to keep the current order which is given by unrelated columns (date_id, time_id).

The problem is, that PARTITION BY orders the records before the ORDER BY clause. So, it defines the primary order and this is not expected.

Once I found a solution for that. I adapted it to your used case. Please read the explanation over there: https://stackoverflow.com/a/52439794/3984221

Interesting part: Your special rank() case is not explicitly required in the query, because my solution creates that out-of-the-box ("by accident" so-to-speak ;) ).

S-Man
  • 22,521
  • 7
  • 40
  • 63
  • Seen your shared post and understood the code. I think this should solve mine too. Will adapt this and let you know. Thanks for the reference post (Y) – Aravind_005 Feb 26 '21 at 10:10
0

You can use DENSE_RANK() like this for your requirment,

  SELECT
     u_id,
     page_name,
     date_id,
     time_id,
     source,
     DENSE_RANK()
  OVER (
       PARTITION BY page_name
       ORDER BY u_id DESC
  ) rn 
  FROM ( SELECT * FROM tbl1 ORDER BY time_id ) AS result;
Ramil Aliyev 007
  • 4,437
  • 2
  • 31
  • 47
onkaram
  • 560
  • 2
  • 7
0

Hmmm . . . If you want the pages ordered by their earliest time, then use two levels of window functions:

select t.*,
       dense_rank() over (partition by uid order by min_rn, pagename) as ranking
from (select t.*,
             min(rn) over (partition by uid, pagename) as min_rn
      from t
     ) t

Note: This uses rn as a convenient shortcut because the date/time is split into two columns. You can also combine them:

select t.*,
       dense_rank() over (partition by uid order by min_dt, pagename) as ranking
from (select t.*,
             min(date_id || time_id) over (partition by uid, pagename) as min_dt
      from t
     ) t;

Note: This solution is different from S_man's. On your sample data, they do the same thing. However, if the user returns to a page, then his gives page a new ranking. This gives the page the same ranking as the first time it appears. It is not clear what you really want.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I am giving rank based on the user page flow actions. So if the user comes back to the same page after a different page visit then the rank of this page visit should be different although the pagename is same – Aravind_005 Feb 28 '21 at 07:29
  • Will take your suggestion and be more specific from next time. Thank you – Aravind_005 Feb 28 '21 at 07:29
  • @Aravind_005 . . . In that case S-Man's answer is the best approach. That wasn't clear from your question. – Gordon Linoff Feb 28 '21 at 13:16