I would like to get the last business day (LBD) of the month and use it to filter records in a DataFrame. I came up with the Python code below, but to apply it in PySpark I need to wrap it in a UDF. Is there any way to get the last business day of the month without using a PySpark UDF?
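import calendar

def last_business_day_in_month(calendarYearMonth):
    year = int(calendarYearMonth[0:4])
    month = int(calendarYearMonth[4:])
    # Week rows run Monday to Sunday; the largest of the last row's first five entries is the last weekday
    last_weekday = max(calendar.monthcalendar(year, month)[-1][:5])
    # Zero-pad the month so the result is always YYYYMMDD (str(month) would drop the leading zero)
    return "{:04d}{:02d}{:02d}".format(year, month, last_weekday)

last_business_day_in_month(calendarYearMonth)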
Here calendarYearMonth is a string in the format YYYYMM.
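
To make the expected behaviour concrete, here is a minimal sketch of the same rule written as plain date arithmetic: take the last calendar day of the month and step back to Friday if it falls on a weekend. The helper name last_business_day is hypothetical, and holidays are ignored, as in the function above. I assume these steps could map onto Spark's built-in last_day, dayofweek and date_sub column functions, but I have not verified that.

```python
import calendar
from datetime import date, timedelta


def last_business_day(year_month: str) -> str:
    """Return the last Monday-to-Friday day of a YYYYMM month as YYYYMMDD.

    Hypothetical helper for illustration only; public holidays are ignored.
    """
    year, month = int(year_month[:4]), int(year_month[4:])
    # monthrange returns (weekday of the 1st, number of days in the month)
    last = date(year, month, calendar.monthrange(year, month)[1])
    # weekday(): Monday=0 ... Sunday=6, so Saturday overshoots by 1 and Sunday by 2
    overshoot = max(last.weekday() - 4, 0)
    return (last - timedelta(days=overshoot)).strftime("%Y%m%d")


if __name__ == "__main__":
    for ym in ("202001", "202002", "202005"):
        print(ym, last_business_day(ym))
```

For example, February 2020 ends on a Saturday, so the call with "202002" steps back one day to 20200228.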