4

I am arriving at the total number of days a service has been used in a month. (Start_Date and End_Date are - both inclusive)

Sample Data 1:

User  Start_Date     End_Date
A     01-Jun-2017    30-Jun-2017
B     06-Jun-2017    30-Jun-2017

Ans: Service used days = 30 days.

Sample Data 2:

User  Start_Date     End_Date
C     06-Jun-2017    10-Jun-2017
D     02-Jun-2017    02-Jun-2017

Ans: Service used days = 6 days.

How do I write a code to find the same, preferable in SQL to PLSQL.

Raghu
  • 83
  • 1
  • 5
  • can you explain the logic needed? – Vamsi Prabhala Jun 22 '17 at 17:17
  • Didnt understand logic to derive (service used days). Sample data for user 'A' is 30 days but for B its 25 days. In your question you have just mentioned 30' – Pravin Satav Jun 22 '17 at 17:20
  • from what you explain it's a basic difference between end-date and start_date, or it's not explained correctly. select user, (end_date - start_date) as "Service Used" from table. – Cyrille MODIANO Jun 22 '17 at 18:37
  • Are the start_date and end_date always in the month you are interested in? Or can you have start_date = 29-May-2017, end_date = 03-Jun-2017, covering three days in June? If these are possible, the question is more interesting. Also: is there another column that shows WHAT is in use (perhaps which piece of equipment, or which room, etc.) - and does the solution need to get you the answer BY PIECE OF EQUIPMENT or BY ROOM etc.? –  Jun 22 '17 at 19:27

2 Answers2

5

Test Data:

CREATE TABLE your_table ( usr, start_date, end_date ) AS (
  SELECT 'A', DATE '2017-06-01', DATE '2017-06-03' FROM DUAL UNION ALL
  SELECT 'B', DATE '2017-06-02', DATE '2017-06-04' FROM DUAL UNION ALL -- Overlaps previous
  SELECT 'C', DATE '2017-06-06', DATE '2017-06-06' FROM DUAL UNION ALL
  SELECT 'D', DATE '2017-06-07', DATE '2017-06-07' FROM DUAL UNION ALL -- Adjacent to previous
  SELECT 'E', DATE '2017-06-11', DATE '2017-06-20' FROM DUAL UNION ALL
  SELECT 'F', DATE '2017-06-14', DATE '2017-06-15' FROM DUAL UNION ALL -- Within previous
  SELECT 'G', DATE '2017-06-22', DATE '2017-06-25' FROM DUAL UNION ALL
  SELECT 'H', DATE '2017-06-24', DATE '2017-06-28' FROM DUAL UNION ALL -- Overlaps previous and next
  SELECT 'I', DATE '2017-06-27', DATE '2017-06-30' FROM DUAL UNION ALL
  SELECT 'J', DATE '2017-06-27', DATE '2017-06-28' FROM DUAL;          -- Within H and I          

Query:

SELECT SUM( days ) AS total_days
FROM   (
  SELECT dt - LAG( dt ) OVER ( ORDER BY dt ) + 1 AS days,
         start_end
  FROM   (
    SELECT dt,
           CASE SUM( value ) OVER ( ORDER BY dt ASC, value DESC, ROWNUM ) * value
             WHEN 1 THEN 'start'
             WHEN 0 THEN 'end'
           END AS start_end
    FROM   your_table
    UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )
  )
  WHERE start_end IS NOT NULL
)
WHERE start_end = 'end';

Output:

TOTAL_DAYS
----------
        25

Explanation:

SELECT dt, value
FROM   your_table
UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )

This will UNPIVOT the table so that the start and end dates are in the same column (dt) and are given a corresponding value of +1 for a start and -1 for an end date.

SELECT dt,
       SUM( value ) OVER ( ORDER BY dt ASC, value DESC, ROWNUM ) AS total,
       value
FROM   your_table
UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )

Will give the start and end dates and the cumulative sum of those generated values. The start of a range will always have value=1 and total=1 and the end of a range will always have total=0. If a date is mid-way through a range then it will either have total>1 or value=-1 and total=1. Using this, if you multiply value and total then the start of a range is when value*total=1 and the end of a range is when value*total=0 and any other value indicates a date that is midway through a range.

Which is what this gives:

SELECT dt,
       CASE SUM( value ) OVER ( ORDER BY dt ASC, value DESC, ROWNUM ) * value
         WHEN 1 THEN 'start'
         WHEN 0 THEN 'end'
       END AS start_end
FROM   your_table
UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )

You can then filter out the dates when the start_end is NULL which will leave you with a table with alternating start and end rows which you can use LAG to calculate the number of days difference:

SELECT dt - LAG( dt ) OVER ( ORDER BY dt ) + 1 AS days,
       start_end
FROM   (
  SELECT dt,
         CASE SUM( value ) OVER ( ORDER BY dt ASC, value DESC, ROWNUM ) * value
           WHEN 1 THEN 'start'
           WHEN 0 THEN 'end'
         END AS start_end
  FROM   your_table
  UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )
)
WHERE start_end IS NOT NULL

All you need to do then is to SUM all the differences for the end - start; which gives the query above.

MT0
  • 143,790
  • 11
  • 59
  • 117
  • You understood my requirement perfectly @MT0. The test case could not be better. Thanks champ. – Raghu Jun 22 '17 at 20:52
  • Hey, can you help me to make the code work with the following data: (the start and/or end date of two rows are the same) SET 1: A 1-Jun-2017 3-Jun-2017 B 1-Jun-2017 2-Jun-2017 Result:NULL Expected result:3 SET 2: A 1-Jun-2017 3-Jun-2017 B 1-Jun-2017 3-Jun-2017 Result:1 Expected result:3 – Raghu Jul 06 '17 at 15:49
  • @Raghu Updated - good catch on those cases. Just need to add `ROWNUM` to the `ORDER BY` clause of the inner most `SUM` so that the value is calculated per row (and not grouped for same dates and start/end value). – MT0 Jul 06 '17 at 21:26
1

As @Pravin Satav addressed, your requirement it's not very clear, something like this is what I understood from your explanation:

SELECT sum(CASE WHEN end_date=start_date THEN 1 ELSE (end_date-start_date)+1 END) as total_days
FROM my_table
WHERE <conditions that determine your "sample data">;
Jair Hernandez
  • 494
  • 3
  • 7