2

I have a dataset that contains the following columns and data:

Customer | Week_number | Amount
cust1    |  0          | 100
cust1    |  1          | 200
cust1    |  3          | 300
cust2    |  0          | 1000
cust2    |  1          | 2000

I need to calculate fortnightly totals for each customer.

With the window functions, I am able to do this:

SELECT 
 CUSTOMER, WEEK_NUMBER
, SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS 1 PRECEDING) AS FORTNIGHT_AMOUNT
FROM AMOUNT

But this adds up the amount even if there is no amount for the previous week. In the example above, for cust1, 3rd row, it adds up week 3 and week 1. The amount should only be added if the week_number is 1 less than the current row's week. Is this possible? Thanks for the help.

What I am getting:

Customer | Week_number | Fortnight_Amount
cust1    |  0          | 100
cust1    |  1          | 300
cust1    |  3          | **500**
cust2    |  0          | 1000
cust2    |  1          | 3000

Expected Result:

Customer | Week_number | Fortnight_Amount
cust1    |  0          | 100
cust1    |  1          | 300
cust1    |  3          | **300**
cust2    |  0          | 1000
cust2    |  1          | 3000
cybharg
  • 55
  • 5

4 Answers4

1

If it's only two weeks/rows your query can be further simplified to a single STATS-step in Explain (as both OLAP-functions apply the same PARTITION/ORDER) :

SELECT T.*
, CASE 
    WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) + 1 = WEEK_NUMBER
    THEN SUM(AMOUNT)      OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
   ELSE AMOUNT
  END AS TWO_WEEK_SUM_AMOUNT
FROM MY_TABLE T
ORDER BY CUSTOMER, WEEK_NUMBER

Of course this assumes that weeks start with 0 and there's no previous year week 52/53.

dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • is there a way this can be used for calculating running totals over 4 weeks.. the conditions are the same.. a week might or might not have a total and the total should be added up only if the week number is between the current week and 3 less than the current.. Thanks.. – cybharg Jan 06 '20 at 07:33
  • You could add a check for the three previous rows and add this individual row (looks ugly, but should be efficient): `CASE WHEN MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) BETWEEN WEEK_NUMBER - 3 AND WEEK_NUMBER THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) ELSE 0 END + ...` Repeat the CASE for rows 2 and 3 preceding. If you're on TD16.10+ you can simolify it using LAG. – dnoeth Jan 06 '20 at 19:27
0

If you just want to ignore week numbers that are not immediatly sequential, you could use lag() first, and then do a window sum():

select
    customer,
    week_number,
    sum(
        case when lag_week_number is null or week_number = lag_week_number + 1 
            then amount
            else 0 
        end
    ) over(partition by customer order by week_number) fortnight_amount
from (
    select 
        t.*, 
        lag(week_number) over(partition by customer order by week_number) lag_week_number
    from mytable t
) t

Actually you might actually want to reset the sum when a there is a gap in week_numbers. For this, which is some kind of gaps-and-islands assignment, you would proceed differently: the idea is to do a cumulative sum to start a new group when two consecutive week numbers ae note sequential, and then sum within each group:

select 
    customer,
    week_number,
    sum(amount) over(partition by customer, grp order by week_date) fortnight_amount
from (
    select 
        t.*,
        sum(
            case 
                when lag_week_number is null or week_number = lag_week_number + 1 
                then 0
                else 1
            end
        ) grp
    from (
        select 
            t.*, 
            lag(week_number) over(partition by customer order by week_number) lag_week_number
        from mytable t
    ) t
) t
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thanks for the answer. I have been trying to run the first answer that you provided and it fails with the error 'Data Type "WEEK_NUMBER" does not match a Defined Type name'. WEEK_NUMBER is an integer.. – cybharg Dec 17 '19 at 01:01
0

You want range partitioning, not row partitioning:

SELECT CUSTOMER, WEEK_NUMBER,
       SUM(AMOUNT) OVER (PARTITION BY CUSTOMER
                         ORDER BY WEEK_NUMBER 
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW
                        ) AS FORTNIGHT_AMOUNT
FROM AMOUNT;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Thanks @Gordon and @GMB for both your answers. Unfortunately, I am not able to use both the LAG function or the RANGE partitioning in Teradata SQL. But I was able to use the concepts that you both described to get to the below answer.

SELECT 
CUSTOMER
, WEEK_NUMBER
, LAG_WEEK_NUMBER
, AMOUNT
, CASE 
  WHEN WEEK_NUMBER = LAG_WEEK_NUMBER + 1 
  THEN SUM(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
  ELSE AMOUNT
END AS TWO_WEEK_SUM_AMOUNT
FROM (
  SELECT 
  T.*
  , MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER
  FROM MY_TABLE T
  ) T
ORDER BY CUSTOMER, WEEK_NUMBER

I was able to get the LAG function implementation in Teradata from @dnoeth's answer in these links:

MAX(WEEK_NUMBER) OVER (PARTITION BY CUSTOMER ORDER BY WEEK_NUMBER ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LAG_WEEK_NUMBER

rows between 1 preceding and preceding 1

Teradata partitioned query ... following rows dynamically

Please let me know if you find any issues with the answer or if it can be improved in any way.

cybharg
  • 55
  • 5