1

I could really do with some help and intend to be active in this community and help others in return. I am a SQL developer using MS SQL Server for the last two years but I've hit a roadblock on this one. Imagine the scenario you have a number of "Accommodation Providers". Each has a certain "Service Capacity". We have a dataset with a number of concurrent "Placements" which can be any duration from a day to several years. We would like to know the "Occupancy Rate" by calculating it as

Occupancy = Placement Days (all days in all placements within period) /
(Capacity x Days in Period) X 100

I have changed names of fields/tables and am showing some made-up sample data here.

We have one dataset in a table (tPL) for "Placements". There are many thousands of records, going back 7 years

e.g

tbl_Placements tPL:

[Provder Name] [Name of Client] [Vacancy Filled Date] [Vacancy End Date]
Accommodation1   John Smith          2018-08-04          2018-08-12
Accommodation1    Jane Smith          2019-01-28          2019-04-09

and:

[Placement_Length_in_Days]
          8
          294

tbl_Month_Year tMY:

Month_Year
2018-03-01
2018-04-01
2018-05-01
2018-06-01
2018-07-01
2018-08-01
2018-09-01
2018-10-01
2018-11-01
2018-12-01
2019-01-01
2019-02-01
2019-03-01
2019-04-01
2019-05-01

and lastly

tbl_Service_Capacity tSC:

[Provider Name]     [Service Capacity]

Accommodation 1           12
Accommodation 2            4

Dividing by the service capacity is the easy part. Where I'm struggling is calculating the total number of "Placement Days" in a given period such as a month or quarter.
If you consider that Accommodation1, 2 and 3 can have multiple concurrent and overlapping placements of different lengths which can start and finish at any time, how can I calculate the total number of days in all placements, that fall within a given time period e.g. quarter or a month, to then calculate the occupancy percentage? The code below is an attempt. I'm presuming all months to be 30 days here, which I know is wrong. I know the logic is wrong here about calculating the number of days. To be honest, I'm almost totally fried and I just can't seem to get this done, hence I'm asking for help.

Am I going about this the wrong way by joining on a date table? Has anyone come against this before. Also if you would like me to give you more information or clarify, I'm happy to do so.

Any help you can give will be hugely appreciated!

Please see the code below. I've tried it a few different ways, but sadly did not save the older versions to show. They didn't work, though. I've done something similar in the past to see how many "open cases" there were at any given point in time. That inspired the code here and went like this:

SELECT        TOP (1000) tMY.Month_Year, COUNT(*) AS ActiveCases
FROM          tbl_Casework   AS tblCW LEFT OUTER JOIN
                         tbl_Month_Year AS tMY ON tMY.Month_Year >= tblCW.Start_Date AND tMY.Month_Year <= DATEADD(day, 31 - DATEPART(day, 
                         ISNULL(tblCW.End_Date, GETDATE())), ISNULL(tblCW.End_Date, GETDATE()))
GROUP BY tMY.Month_Year

This definitely worked, but was just a count of "how many cases were open at some point during each month?"

SELECT tMY.Month_Year
        ,tPL.[Accommodation Provider]
        ,tSC.[Service_capacity_Total]
-- if started before month began and closed at or after end of month / or still open
        ,(sum(case when (datediff(day, tPL.[Vacancy Filled Date], [tMY].[MonthYear])<0 AND 
                (datediff(day, [tMY].[Month_Year], tPL.[Vacancy End Date])>=30) OR tPL.[Vacancy End Date] is null) then 30 
-- if started after month began and closed during month
        ,sum(case when (datediff(day, tPL.[Vacancy Filled Date], [tMY].[MonthYear])>=0 AND 
                datediff(day, [tMY].[Month_Year], tPL.[Vacancy End Date])<=30) then tPL.[Placement_Length_in_Days]
-- if started before and closed after month - take filled date to end of month
        ,sum(case when datediff(day, [tMY].[Month_Year], tPL.[Vacancy End Date])>=30 AND datediff(day, tPL.[Vacancy Filled Date], [tMY].[Month_Year])<0 then 
        datediff(day, tPL.[Vacancy Filled Date], DATEADD(DAY, 30, tMY.Month_Year)) END) / (tSC.[Service_capacity]*30)*100 As [Occupancy Rate]

  FROM [tbl_Placements] tPL
  inner join tbl_Service_Capacity tSC on tSC.[Service Name] = tPL.[Accommodation Provider]
  left outer join tbl_Month_Year tMY ON tMY.MonthYear >= [Vacancy Filled Date] and tMY.MonthYear <= DATEADD(day, 30, tPL.[Vacancy Filled Date])
  WHERE tPL.[Vacancy Filled Date] >= '20160501' and tMY.MonthYear < (getdate()-30) AND tSC.[Service Capacity] IS NOT NULL
  group by tMY.MonthYear, tPL.[Service Name], tSC.[Service Capacity]--, tPL.[Client Name]
  order by tMY.MonthYear Asc

The code runs but I get crazy occupancy rates at 300% or 3% so the figures must be incorrect. The only part I'm sure of is taking the [Placement_Length_in_Days] when it starts and finishes within the time period. The calculations here are wrong, I'm sure of that.

abpatil
  • 916
  • 16
  • 20
Gizmo191
  • 13
  • 3
  • I'd start with a `m:n` relation between a days table and an accomodation table with a mapping table showing one row for each day, where an accomodation is occupied. You can pre-compute this for a larger periode and narrow this for any given periode. Please add some more rows to your sample data to simulate some edge cases (overlapping etc.) and please provide this as [mcve] with DDL and insert statement. And please provide the expected output. – Shnugo Aug 09 '19 at 11:58
  • Hi Shnugo. Thank you for coming back to me. Sorry that I'm a bit new to this site so I wasn't sure what I should provide. But I will get those things and add them here. – Gizmo191 Aug 09 '19 at 12:12
  • Your edit did not help that much... Did you check my suggestion? It should be enough t count the rows per accommodation within a given periode – Shnugo Aug 09 '19 at 17:06
  • Hi Shnugo, I have just come back to this today, forgive me, I'm going to look at the answer posted below to see if this will work, but I will do the suggested edits if not. Thank you. – Gizmo191 Aug 12 '19 at 14:25

1 Answers1

0

To give you a quick shot, you might try this:

DECLARE @tbl_Placements TABLE
                        (
                            [Provider Name] VARCHAR(100), 
                            [Name of Client] VARCHAR(100), 
                            [Vacancy Filled Date] DATE, 
                            [Vacancy End Date] DATE
                        );

INSERT INTO @tbl_Placements 
VALUES ('Accommodation1', 'John Smith', '2018-08-04', '2018-08-12'),
       ('Accommodation1', 'Jane Smith ', '2019-01-28', '2019-04-09');

SELECT 
    p.[Provider Name], p.[Name of Client],
    DATEADD(DAY, A.Nmbr - 1, p.[Vacancy Filled Date]) AS OccupiedAt
FROM 
    @tbl_Placements p
CROSS APPLY
    (SELECT TOP (DATEDIFF(DAY, p.[Vacancy Filled Date], p.[Vacancy End Date]) + 1)
         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
     FROM 
         master..spt_values) A(Nmbr);

The idea in short:

  • We use CROSS APPLY to create a joined set per row.
  • We use a computed TOP clause to get the right count of rows back
  • We create a numbers-table on the fly, simply by querying any table with enough rows (here I took master..spt_values. We do not need the actual table's content, just a counter we get from ROW_NUMBER().
  • We return the set together with a running day starting with the first day of occupation and ending with the last day of occupation.

Hint: This was much easier, if you have an existing physical numbers/date table in your database. You would simply inner join this table with a BETWEEN in the ON-clause.
You might read this.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • 1
    Fantastic! I didn't quite understand what this was doing until I ran it and saw the output. The only little bit I had to change was to change the cross apply to SELECT TOP (DATEDIFF(DAY,HPR.[Vacancy Filled Date],isnull(HPR.[Vacancy End Date],getdate())+1)) to account for Null vacancy end dates where "placements" are still open. – Gizmo191 Aug 13 '19 at 12:05
  • @Gizmo191 Glad to help you! As told in a comment above you can pre-compute this to avoid long running queries and get your results simply by counting the rows with in a periode. – Shnugo Aug 13 '19 at 12:27
  • 1
    Yes, you're right. As it's not critical to have realt time results, I think I would set up a view to refresh once a week and pull results from that for any requested time period. – Gizmo191 Aug 14 '19 at 17:03