0

Oracle version: 12c.

I have the below table:

CAT_PROD    FROM_DT     TO_DT       IMP
A1          15/01/2023  07/02/2023  100
A2          13/01/2023  16/01/2023  100

And I would like to create the below output:

CAT_PROD    RANGE_DT    IMP_RANGE   EXPLANATION
A1          202301      70,83       There are 17 days between 15th Jan (included) and 31st Jan. 17*100/24 = 70.83. 24 is the number of days between 15th Jan and 7th Feb.
A1          202302      29,17       There are 7 days between 1st Feb (included) and 7th Feb. 7*100/24 = 29.17. 24 is the number of days between 15th Jan and 7th Feb.
A2          202301      100         There are 4 days between 13th Jan (included) and 16th Jan. 4*100/4= 100. 4 is the number of days between 13th Jan and 16th Jan.

Criteria:

  1. IMP is prorated amongs the days of the month in which it falls.
  2. RANGE_DT is the YYYYMM for each combination of days in the month.
  3. The combination CAT_PROD and RANGE_DT should be unique.

EDIT: This is what I have tried, but it is very very slow when having more than a few rows in the original data.

WITH aux(cat_prod, startdate, enddate, imp) AS
  (SELECT 'A1' , DATE'2023-01-15' , DATE'2023-02-07' , 100 from dual
  
  UNION
  
  SELECT 'A2' , DATE'2023-01-13' , DATE'2023-01-16' , 100 from dual
  ),
  apply_cross as
  (select e.cat_prod,
    e.imp,
    enddate-startdate + 1 total_days_range,
    case
      when e.startdate > x.s_date
      then e.startdate
      else x.s_date
    end as start_date,
    case
      when e.enddate < x.e_date
      then e.enddate
      else x.e_date
    end as end_date
  from aux e cross apply
    (select trunc( e.startdate, 'mm') + (level-1) * interval '1' month         as s_date,
      trunc( e.startdate              + (level) * interval '1' month, 'mm') -1 as e_date
    from dual
      connect by level <= months_between( trunc( e.enddate, 'mm'),trunc( e.startdate, 'mm')) + 1
    ) x
  )
select ac.cat_prod,
  to_char(start_date, 'YYYYMM') month_id,
  round(imp*(end_date-start_date+1)/total_days_range, 2) imp_prorate
from apply_cross ac;
Javi Torre
  • 724
  • 8
  • 23
  • 1
    "I want" is no question and SO is no code writing service. Please tell us what you have tried, what exactly went wrong and what exactly you need to know. – Jonas Metzler Feb 07 '23 at 07:45
  • Apologies, English is not my native language. I have edited my question. @JonasMetzler – Javi Torre Feb 07 '23 at 08:30
  • Thanks for adding the query you wrote. Is the "only" issue that it is executed too slow or did you encounter incorrect results? It's difficult to say why a query is slow without access to the DB. Did you check the execution plan? One problem could be you are using UNION. UNION ALL is often faster. – Jonas Metzler Feb 07 '23 at 08:55

2 Answers2

1

You can use a recursive sub-query factoring clause to split each range into months (as some tests indicate that recursive queries are faster than hierarchical queries):

WITH months (cat_prod, startdate, monthend, enddate, day_rate) AS (
  SELECT cat_prod,
         startdate,
         LEAST(ADD_MONTHS(TRUNC(startdate, 'MM'), 1), enddate + 1),
         enddate + 1,
         imp / (enddate + 1 - startdate)
  FROM   aux
UNION ALL
  SELECT cat_prod,
         monthend,
         LEAST(ADD_MONTHS(monthend, 1), enddate),
         enddate,
         day_rate
  FROM   months
  WHERE  monthend < enddate
)
SEARCH DEPTH FIRST BY cat_prod, startdate SET order_id
SELECT cat_prod,
       TO_CHAR(startdate, 'YYYYMM') AS range_dt,
       ROUND((monthend - startdate) * day_rate, 2) AS imp_range
FROM   months;

Which, for the sample data:

CREATE TABLE aux(cat_prod, startdate, enddate, imp) AS
  SELECT 'A1', DATE '2023-01-15', DATE '2023-02-07', 100 FROM DUAL UNION ALL
  SELECT 'A2', DATE '2023-01-13', DATE '2023-01-16', 100 FROM DUAL;

Outputs:

CAT_PROD RANGE_DT IMP_RANGE
A1 202301 70.83
A1 202302 29.17
A2 202301 100

fiddle

MT0
  • 143,790
  • 11
  • 59
  • 117
  • @p3consulting [fiddle](https://dbfiddle.uk/v4B2oOP1) works fine for multiple months. – MT0 Feb 07 '23 at 12:56
  • Thansk @mt0 could you please briefly explain how that recursion is working? – Javi Torre Feb 07 '23 at 16:53
  • 1
    @JaviTorre In the initial part of the recursive query, `ADD_MONTHS(TRUNC(startdate, 'MM'), 1)` truncates the start date to the beginning of the month and then adds 1 month so that it is the beginning of the next month; then in the recursive part (after the `UNION ALL`), `ADD_MONTHS(monthend, 1)` adds 1 month to the end of the previous month to get the end of the next month and `WHERE monthend < enddate` is used to stop the recursion at the appropriate point. `LEAST` is used so that the date in the final month does not go past the `enddate`. – MT0 Feb 07 '23 at 18:13
0

Try this one:

WITH rdata(rn, cat_prod, from_dt, to_dt, imp) AS (
    SELECT row_number() OVER(PARTITION BY cat_prod ORDER BY from_dt, to_dt), d.* FROM the_table d
),
all_ym(cat_prod, range_dt, imp_range) as (
    SELECT cat_prod,  
        TO_CHAR(ADD_MONTHS(from_dt, LEVEL - 1), 'YYYYMM'),
        ROUND( 
            imp*(
                LEAST(to_dt, LAST_DAY(TRUNC(ADD_MONTHS(from_dt, LEVEL-1),'MONTH')))
                -
                GREATEST(from_dt, TRUNC(ADD_MONTHS(from_dt,LEVEL-1),'MONTH')) + 1)
                / (to_dt - from_dt + 1),
            2)
    FROM rdata
    CONNECT BY LEVEL <= (MONTHS_BETWEEN(TRUNC(to_dt,'MONTH'),TRUNC(from_dt,'MONTH'))+1) 
        AND PRIOR cat_prod = cat_prod AND PRIOR rn = rn 
        AND PRIOR sys_guid() IS NOT NULL
)
SELECT d.cat_prod, d.range_dt, d.imp_range --, d.nmonths
FROM all_ym d
;

tested on >100K random data, (on the same data solution with cross apply generates ORA-01839 and solution with recursive CTE returns incorrect data when end date is the 29/02 of a bissextile year)

(Note that the specs says date ranges are unique par category, but nothing about overlaps... may still have to improve)

p3consulting
  • 2,721
  • 2
  • 12
  • 10
  • 1
    Generating a GUID for each row you are generating is an expensive operation. There are ways to use hierarchical queries and prevent excessive duplicates without having to apply such "hacks" as generating GUIDs. (The OP uses one example with `CROSS APPLY`) – MT0 Feb 07 '23 at 13:01
  • Tested on random 100K rows, the difference between the 2 is negligible, both giving equivalent execution times ±10%, randomly one faster than the other. – p3consulting Feb 08 '23 at 09:26