5

I have a subset of a SQL Server 2008 R2 table like this:

cust_id | prod_id | day | price
--------+---------+-----+-------
137656    194528   42373   9.11
137656    194528   42374   9.11
137656    194528   42375   9.61
137656    194528   42376   9.61
137656    194528   42377   9.11
137656    194528   42378   9.11

I need to rank the different price periods like so:

cust_id | prod_id | day | price | rank
--------+---------+-----+-------+------
137656    194528   42373   9.11     1
137656    194528   42374   9.11     1
137656    194528   42375   9.61     2
137656    194528   42376   9.61     2
137656    194528   42377   9.11     3
137656    194528   42378   9.11     3

so that it sorts by cust_id, prod_id and day ascending but increments the rank when the price changes. I have tried to use DENSE_RANK() like this:

SELECT 
    cust_id, prod_id, [day], price, 
    DENSE_RANK() OVER (ORDER BY cust_id, prod_id, price)
FROM
    @prices 

This returns something like:

cust_id | prod_id | day | price | rank
--------+---------+-----+-------+------
137656    194528   42373   9.11     1
137656    194528   42374   9.11     1
137656    194528   42377   9.11     1
137656    194528   42378   9.11     1
137656    194528   42375   9.61     2
137656    194528   42376   9.61     2

Obviously excluding the day from the sort will give me these results but whenever I include the day in the order by section of the DENSE_RANK() - it just partitions each new day as a new ID....

Does anyone have any ideas on how this should work? Appreciate any advice and can give more info if required

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
samil90
  • 107
  • 6
  • Just as a note, I tried using the [difference in row number method](http://rextester.com/COMB14371), but it failed, so this probably isn't a good approach. – Tim Biegeleisen Nov 20 '17 at 11:21
  • what will be the rank when cust_id or prod_id changed? it would be better if you add that also in your sample. – Abdul Rasheed Nov 20 '17 at 11:35

1 Answers1

1

The first variant with LAG and SUM

SELECT
  *,
  1+SUM(IncCount)OVER(PARTITION BY cust_id ORDER BY [day]) [rank]
  --1+SUM(IncCount)OVER(PARTITION BY cust_id ORDER BY [day] ROWS BETWEEN unbounded preceding AND current row) [rank]
FROM
  (
    SELECT
      *,
      IIF(LAG(price)OVER(PARTITION BY cust_id ORDER BY [day])<>price,1,0) IncCount
      --CASE WHEN LAG(price)OVER(PARTITION BY cust_id ORDER BY [day])<>price THEN 1 ELSE 0 END IncCount
    FROM Test
  ) q

The second variant without LAG

WITH numCTE AS(
  SELECT *,ROW_NUMBER()OVER(PARTITION BY cust_id ORDER BY [day]) RowNum
  FROM Test
)
SELECT
  t1.*,
  1+SUM(CASE WHEN t2.price<>t1.price THEN 1 ELSE 0 END)OVER(PARTITION BY t1.cust_id ORDER BY t1.[day]) [rank]
  --1+SUM(CASE WHEN t2.price<>t1.price THEN 1 ELSE 0 END)OVER(PARTITION BY t1.cust_id ORDER BY t1.[day] ROWS BETWEEN unbounded preceding AND current row) [rank]
FROM numCTE t1
LEFT JOIN numCTE t2 ON t2.RowNum+1=t1.RowNum AND t2.cust_id=t1.cust_id

The third variant with recursive CTE

WITH numCTE AS(
  SELECT *,ROW_NUMBER()OVER(PARTITION BY cust_id ORDER BY [day]) RowNum
  FROM Test
),
rankCTE AS(
  SELECT RowNum,cust_id,prod_id,[day],price,1 [rank]
  FROM numCTE
  WHERE RowNum=1

  UNION ALL

  SELECT
    n.RowNum,n.cust_id,n.prod_id,n.[day],n.price,
    r.[rank]+CASE WHEN n.price<>r.price THEN 1 ELSE 0 END [rank]
  FROM numCTE n
  JOIN rankCTE r ON n.RowNum=r.RowNum+1 AND n.cust_id=r.cust_id
)
SELECT *
FROM rankCTE
OPTION(MAXRECURSION 0)
Sergey Menshov
  • 3,856
  • 2
  • 8
  • 19
  • Thanks for this, it looks like it would work but unfortunately LAG does not work on a sql 2008 R2 database, I should've mentioned this in the initial question, I've updated it now - apologies! – samil90 Nov 20 '17 at 10:48
  • Try the second variant. I hope it works in SQLServer 2008. – Sergey Menshov Nov 20 '17 at 10:56
  • Thanks Leran, I have tested both latter options. Option 2 gives me a 2 for the first row of the new price but then reverts back to 1 - making it hard to group by this id and get a min and max day to give a price range. Option 3 is better as it gives me an incremental rank for each new price but is not taking account of the different cust_ids – samil90 Nov 20 '17 at 11:23
  • I added PARTITION BY cust_id into all three variants. – Sergey Menshov Nov 20 '17 at 11:39
  • It didn't seem to make a difference, however, I can group on cust_id and rank to get the price ranges I need - so this worked perfectly, thankyou! – samil90 Nov 20 '17 at 11:40