1

I need to create a table that contains records with 1) all 365 days of the year and 2) a counter representing which business day of the month the day is. Non-business days should be represented with a 0. For example:

Date       |  Business Day
2019-10-01    1
2019-10-02    2
2019-10-03    3
2019-10-04    4
2019-10-05    0    // Saturday
2019-10-06    0    // Sunday     
2019-10-07    5   
....
2019-11-01    1
2019-11-02    0    //  Saturday
2019-11-03    0    //  Sunday
2019-11-04    2

So far, I've been able to create a table that contains all dates of the year.

CREATE TABLE ${TMPID}_days_of_the_year 
(
  `theDate` STRING
);


INSERT OVERWRITE TABLE ${TMPID}_days_of_the_year 
select
    dt_set.theDate
  from
  (
  -- last 0~99 months
    select date_sub('2019-12-31', a.s + 10*b.s + 100*c.s) as theDate
    from
    (
      select 0 as s union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
    ) a
    cross join
    (
      select 0 as s union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
    ) b
    cross join
    (
      select 0 as s union all select 1 union all select 2 union all select 3
    ) c
  ) dt_set
  where dt_set.theDate between '2019-01-01' and '2019-12-31'
  order by dt_set.theDate DESC;

And I also have a table that contains all of the weekend days and holidays (this data is loaded from a file, and the date format is YYYY-MM-DD)

CREATE TABLE ${TMPID}_company_holiday 
(
  `holidayDate` STRING
) 
;
LOAD DATA LOCAL INPATH '${FILE}' INTO TABLE ${TMPID}_company_holiday;

My question is.... how do I join these tables together while creating the business day counter column shown as in the sample data above?

TokyoMike
  • 798
  • 4
  • 16

1 Answers1

1

You can use row_number() for the enumeration. This is a little tricky, because it needs to be conditional, but the information you need is provided by a left join:

select dy.*,
       (case when ch.holiday_date is null
             then row_number() over (partition by trunc(dy.date, 'MONTH'), ch.holiday_date
                                     order by dy.date
                                    )
             else 0
        end) as business_day
from days_of_the_year dy left join
     company_holiday ch
     on dy.date = ch.holiday_date;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786