I am trying to replace some hardcoded SQL queries related to timezone changes with a more dynamic/data-driven Python script. I have a dataset that looks like this spreadsheet below. WEEK_START/DAY/MONTH is the week, day, and month when daylight savings time begins (for example Canberra starts the first Sunday of April while Vienna is the last Sunday of March). The end variables are in the same format and display when it ends.
Here is the issue. I have seen solutions for specific use cases such as this, finding the last Sunday of the month:
current_year=today.year
current_month=today.month
current_day=today.day
month = calendar.monthcalendar(current_year, current_month)
day_of_month = max(month[-1][calendar.SUNDAY], month[-2][calendar.SUNDAY])
print(day_of_month)
31
This tells me that the last day of this month is the 31st. I can adjust the attributes for one given month/scenario, but how would I make a column for each and every row (city) to retrieve each? That is, several cities that change times on different dates? I thought if I could set attributes in day_of_month in an apply function it would work but when I do something like weekday='SUNDAY' it returns an error because of course the string 'SUNDAY' is not the same as SUNDAY the attribute of calendar. My SQL queries are grouped by cities that change on the same day but ideally anyone would be able to edit the CSV that loads the above dataset as needed and then each day the script would run once to see if today is between the start and end of daylight savings. We might have new cities to add in the future. I'm confident in doing that bit but quite lost on how to retrieve the dates for a given year.
My alternate, less resilient, option is to look at the distinct list of potential dates (last Sunday of March, first Sunday of April, etc.), write code to retrieve each one upfront (as in the above snippet above), and assign the dates in that way. I say that this is less resilient because if a city is added that does not fit in an existing group for time changes, the code would need to be altered as well.
So stackoverflow, is there a way to do this in a data driven way in pandas through an apply or something similar? Thanks in advance.