I need to count the subscribers who were active in each month of a given timeframe: those whose effective date precedes the end of the month and whose termination date falls on or after the first day of that month. I then need to group the results in a report by year and month. For example, if someone became a subscriber on 2/12/2019 and remains one until 10/12/2019, the output needs to show a count of 1 for each month in which the subscription was effective and a count of 0 for every other month. I need to count every condition that is met, not just the first one, which is all that CASE does.

I use SAS and have tried doing this with CASE WHEN, manually creating the year-month groupings I want to use to aggregate the subscribers that meet each WHEN condition. I suspect this will not accomplish what I am hoping for, because CASE stops evaluating once the first condition is met, which defeats the point of the exercise. I need to count every year_month period in which a subscriber is effective. With CASE, only the first month of an effective subscription is counted and the rest of its duration is dropped from the count.

    proc sql;
    create table subscriber_testing as
        select distinct
            case    
                when    sub_eff_date < '01Feb2018'd and sub_term_date >= '01Jan2018'd then '201801'
                when    sub_eff_date < '01Mar2018'd and sub_term_date >= '01Feb2018'd then '201802'
                when    sub_eff_date < '01Apr2018'd and sub_term_date >= '01Mar2018'd then '201803'
                when    sub_eff_date < '01May2018'd and sub_term_date >= '01Apr2018'd then '201804'
                when    sub_eff_date < '01Jun2018'd and sub_term_date >= '01May2018'd then '201805'
                when    sub_eff_date < '01Jul2018'd and sub_term_date >= '01Jun2018'd then '201806'
                when    sub_eff_date < '01Aug2018'd and sub_term_date >= '01Jul2018'd then '201807'
                when    sub_eff_date < '01Sep2018'd and sub_term_date >= '01Aug2018'd then '201808'
                when    sub_eff_date < '01Oct2018'd and sub_term_date >= '01Sep2018'd then '201809'
                when    sub_eff_date < '01Nov2018'd and sub_term_date >= '01Oct2018'd then '201810'
                when    sub_eff_date < '01Dec2018'd and sub_term_date >= '01Nov2018'd then '201811'
                when    sub_eff_date < '01Jan2019'd and sub_term_date >= '01Dec2018'd then '201812'
                when    sub_eff_date < '01Feb2019'd and sub_term_date >= '01Jan2019'd then '201901'
                when    sub_eff_date < '01Mar2019'd and sub_term_date >= '01Feb2019'd then '201902'
                when    sub_eff_date < '01Apr2019'd and sub_term_date >= '01Mar2019'd then '201903'
                else "n/a"
            end 
        as year_month,
            count(distinct subscriber_ID) as subscriber_count
        from
            prod.subscriber_detail
        where
            subscriber_group like '%product_tx_%'
            and
            sub_term_date >= '01jan2018'd
            and
            sub_eff_date <= '31mar2019'd
        group by
            calculated year_month
    ;
    quit;

If I were to query for the one subscriber, the result should produce this desired output:


—————————————————————————
column1_year_month
201801
201802
201803
201804
201805
201806
201807
201808
201809
201810
201811
201812


column2_subscriber_count
0
1
1
1
1
1
1
1
1
0
0
—————————————————————————

As you can see from the result set, I also need to report zeroes for the months where there are no subscribers.

The data columns are stored in one table as follows:
Name, Type, Length, Format
Product, Character, 80, 80
Subscriber_ID, Character, 20, 20
Eff_Date, Date, 8, DATETIME22.3
Term_Date, Date, 8, DATETIME22.3

The COUNTIF function in Excel handles this fine, but I have yet to find a way to accomplish this strictly in a DBMS. I am hoping to find a solution that is DBMS-agnostic. Unfortunately, a CASE expression would terminate once a condition is met. What I need is something that counts every condition that is met and allows me to group those counts by the monthly period that they match.

Any help is greatly appreciated.

1 Answer

data date_months;
infile cards dlm=',' dsd;
input date1 :DATE9. date2 :DATE9.;
format date1 date2 DATE9.;  /* display as dates rather than raw day counts */
cards;
01Jan2018,01Feb2018
01Feb2018,01Mar2018
01Mar2018,01Apr2018
01Apr2018,01May2018
01May2018,01Jun2018
01Jun2018,01Jul2018
01Jul2018,01Aug2018
01Aug2018,01Sep2018
01Sep2018,01Oct2018
01Oct2018,01Nov2018
01Nov2018,01Dec2018
01Dec2018,01Jan2019
01Jan2019,01Feb2019
01Feb2019,01Mar2019
01Mar2019,01Apr2019
01Apr2019,01May2019
;
RUN;

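PROC SQL;
create table subscriber_testing as
        select
            a.Date1 as Year_Month format=yymmn6.,
            count(distinct b.subscriber_ID) as subscriber_count
        from
                date_months a
                left join prod.subscriber_detail b
                    /* Filters on b go in the ON clause: in a WHERE clause they
                       would discard the months that have no matching subscriber. */
                    ON  b.sub_eff_date < a.Date2
                    AND b.sub_term_date >= a.Date1
                    AND b.subscriber_group like '%product_tx_%'
                    AND b.sub_term_date >= '01jan2018'd
                    AND b.sub_eff_date <= '31mar2019'd
        /* One output row per month; months with no match get a count of 0. */
        group by
            a.Date1
;
quit;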
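
One thing worth checking: the question lists the date columns with a DATETIME22.3 format. If they really hold datetime values (seconds rather than days), comparing them directly with the date values in date_months will not behave as intended. A sketch of one workaround, assuming that is the case; the view name subscriber_dates is made up for illustration:

```sas
/* Assumption: sub_eff_date and sub_term_date hold datetime values.
   datepart() extracts the date portion so the columns can be compared
   with ordinary SAS dates such as date1 and date2. */
proc sql;
    create view work.subscriber_dates as
        select
            subscriber_ID,
            subscriber_group,
            datepart(sub_eff_date)  as sub_eff_date  format=date9.,
            datepart(sub_term_date) as sub_term_date format=date9.
        from
            prod.subscriber_detail
    ;
quit;
```

The join would then read from work.subscriber_dates instead of prod.subscriber_detail.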

Disclaimer: it has been a long time since I used SAS.

If you have a calendar table handy in your database, you could build the month table from it instead: take the first day of each month as date1 and that date plus one month as date2.
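
The month table could also be generated rather than typed in. A minimal sketch, assuming the same range as the cards block (01Jan2018 through 01Apr2019); intnx is the standard SAS function for stepping a date by an interval:

```sas
/* Sketch: build one row per month with a loop instead of a cards block.
   The start and end dates are assumptions copied from the hand-typed list. */
data date_months;
    format date1 date2 date9.;
    date1 = '01Jan2018'd;
    do while (date1 <= '01Apr2019'd);
        date2 = intnx('month', date1, 1);  /* first day of the following month */
        output;
        date1 = date2;                     /* advance to the next month */
    end;
run;
```

Changing the two date literals is then enough to cover a different reporting window.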

  • Hi @Stats_Help_Wanted. Thanks for sharing that solution. Unfortunately, it did not produce anything in the result set. I got one row in the result set with the value "." in the Year_Month column and an empty value in the subscriber_count column. I do follow your logic, but whether using the generated date_months DATA step or creating a calendar table, I was unable to produce the desired output with this solution. I have created a calendar table, but the way that SAS processes dates is frustrating. However, that has not prevented me from being able to count or sum when using date-based logic. –  May 01 '19 at 20:46