2

I am trying to create an SQL UDF in bigQuery to calculate week in month. I got the result that I'm expecting but my function looks super messy.

create or replace function internal.week_in_month(my_date TIMESTAMP)
returns FLOAT64 as
(

case when 
        (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7 
                          else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) > 1 then -- check first day of month to decide if it's a complete week (starts on Monday)
                case when EXTRACT(DAY FROM my_date) <= 7 then -- for incomplete week
                    case when 
                            (case when EXTRACT(DAYOFWEEK FROM my_date) = 1 then 7 else EXTRACT(DAYOFWEEK FROM my_date)-1 end)  -  EXTRACT(DAY FROM my_date) =
                                 (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7 
                                   else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -1 then 1 -- incomplete week 1
                      else FLOOR(( EXTRACT(DAY FROM my_date) + (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7
                                                                 else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -2 )/7)+1 end  -- calculate week based on date
            else FLOOR(( EXTRACT(DAY FROM my_date) + (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7 
                                                         else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -2 )/7)+1 end -- calculate week based on date
     else FLOOR((EXTRACT(DAY FROM my_date)-1)/7)+1 -- for complete week
        end
)

Is there a way to store a variable in bigQuery so my code can be less confusing to look? So far I've been reading bQ documentation and haven't found a way to store a variable inside function (for SQL UDF)

Any help would be appreciated, thanks!

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230

1 Answers1

0

Agree - it looks ugly!
So, I see you are reusing below expression eight(8) times

EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day))         

Consider below approach to refactor your initial code to use that expression only once into ABC alias/field and than reuse it in a "variable" fashion

create or replace function internal.week_in_month(my_date TIMESTAMP)
returns FLOAT64 as
((
select case 
  when (case when ABC = 1 then 7 else ABC -1 end) > 1 then -- check first day of month to decide if it's a complete week (starts on Monday)
    case when EXTRACT(DAY FROM my_date) <= 7 then -- for incomplete week
      case when 
       (case when EXTRACT(DAYOFWEEK FROM my_date) = 1 then 7 else EXTRACT(DAYOFWEEK FROM my_date)-1 end)  -  EXTRACT(DAY FROM my_date) =
          (case when ABC = 1 then 7 
            else ABC -1 end) -1 then 1 -- incomplete week 1
          else FLOOR(( EXTRACT(DAY FROM my_date) + (case when ABC = 1 then 7
        else ABC -1 end) -2 )/7)+1 end  -- calculate week based on date
      else FLOOR(( EXTRACT(DAY FROM my_date) + (case when ABC = 1 then 7 
    else ABC -1 end) -2 )/7)+1 end -- calculate week based on date
  else FLOOR((EXTRACT(DAY FROM my_date)-1)/7)+1 -- for complete week
end
from unnest([struct(EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) as ABC)])
));

We can go further and apply same approach to EXTRACT(DAY FROM my_date) which is used six(6) times

create or replace function internal.week_in_month(my_date TIMESTAMP)
returns FLOAT64 as
((
select case 
  when (case when ABC = 1 then 7 else ABC -1 end) > 1 then -- check first day of month to decide if it's a complete week (starts on Monday)
    case when XYZ <= 7 then -- for incomplete week
      case when 
       (case when EXTRACT(DAYOFWEEK FROM my_date) = 1 then 7 else EXTRACT(DAYOFWEEK FROM my_date)-1 end)  -  XYZ =
          (case when ABC = 1 then 7 
            else ABC -1 end) -1 then 1 -- incomplete week 1
          else FLOOR(( XYZ + (case when ABC = 1 then 7
        else ABC -1 end) -2 )/7)+1 end  -- calculate week based on date
      else FLOOR(( XYZ + (case when ABC = 1 then 7 
    else ABC -1 end) -2 )/7)+1 end -- calculate week based on date
  else FLOOR((XYZ-1)/7)+1 -- for complete week
end
from unnest([struct(EXTRACT(DAY FROM my_date) as XYZ)]),
  unnest([struct(EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(XYZ)+1 day)) as ABC)])
));       

In above example - I was lazy enough and used just XYZ and ABC - in hope that you will use appropriate naming which will make our final code readable enough to be happy with

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230