
I am writing a Python script using datetime, holidays and dateutil to determine whether a given date in YYYY-MM-DD format is a trading holiday. I'm using a list comprehension to remove holidays on which the market is not closed from the default list of holidays provided by the holidays library:

import datetime, holidays
import dateutil.easter as easter

def to_date(date_string):
    return datetime.datetime.strptime(date_string,'%Y-%m-%d').date()

def is_trading_holiday(date):
    us_holidays = holidays.UnitedStates(years=date.year)
    # generate list without columbus day and veterans day since markets are open on those days
    trading_holidays = [ "Columbus Day", "Columbus Day (Observed)", "Veterans Day", "Veterans Day (Observed)"]
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
    # add good friday to list since markets are closed on good friday
    custom_holidays.append(easter.easter(year=date.year) - datetime.timedelta(days=2))

    return date in custom_holidays

if __name__=="__main__":
    first_date = to_date('2020-01-03')
    second_date = to_date('2015-11-26') # Thanksgiving
    third_date = to_date('2005-01-01') # New Years
    fourth_date = to_date('2005-01-07')

    print(is_trading_holiday(first_date))
    print(is_trading_holiday(second_date))
    print(is_trading_holiday(third_date))
    print(is_trading_holiday(fourth_date))

I've tested this for a variety of dates and it seems to work in all cases but one. When I use dates from the year 2005, this function blows up and tells me,

Traceback (most recent call last):
  File "./test.py", line 26, in <module>
    print(is_trading_holiday(third_date))
  File "./test.py", line 11, in is_trading_holiday
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
  File "./test.py", line 11, in <listcomp>
    custom_holidays = [ date for date in us_holidays if us_holidays[date] not in trading_holidays ]
RuntimeError: dictionary changed size during iteration

I have no idea what is special about 2005 that makes this function blow up, or even whether the year is what causes the problem (I have tested dates going back to the seventies, and they work). I am not modifying the dictionary I am iterating over in the list comprehension (or at least, I don't think I am), so I'm not sure what this error is trying to tell me.

Anyone know what is going on here? Am I missing something obvious?

Grant Moore
  • Does it change anything if you change the comprehension to `[date for date, holiday in us_holidays.items() if holiday not in trading_holidays]`? – Paul Oct 16 '21 at 16:06
  • It seems that my version works (trying it in a Colab on mobile), so my guess is that `__getitem__` can sometimes have the side effect of modifying the underlying dictionary. In 2005 "New Years Day (observed)" takes place in 2004, but comes *after* New Years Day in the iterator, and that's the entry that modifies the dictionary, so it's probably an implementation detail of the library leaking out when holidays come out of order? Probably worth reporting it as a bug. – Paul Oct 16 '21 at 16:22
  • Ah, interestingly, if you use `__getitem__` on every entry in `us_holidays` for 2005, then loop over it again, New Years (observed) doesn't show up! Evidently `__getitem__` is pruning the date from 2004. – Paul Oct 16 '21 at 16:26

1 Answer


There seems to be a bug (or special case) in the UnitedStates class that generates datetime.date(2004, 12, 31): "New Year's Day (Observed)" for 2005. As a result, the us_holidays[date] lookup in your list comprehension references a different year (one that has not been loaded yet), and that lookup alters the dictionary you are traversing.
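The failure mode can be reproduced without the holidays library. The following is only a minimal sketch: `LazyYearDict` is a hypothetical stand-in, assuming a dict that loads a whole year of entries the first time a key from an unloaded year is looked up. It illustrates the mechanism described above and is not the library's actual implementation.

```python
import datetime


class LazyYearDict(dict):
    """Hypothetical stand-in: loads a year's entries on first lookup."""

    def __init__(self, year):
        super().__init__()
        self.loaded_years = set()
        self.load_year(year)

    def load_year(self, year):
        if year in self.loaded_years:
            return
        self.loaded_years.add(year)
        new_year = datetime.date(year, 1, 1)
        self[new_year] = "New Year's Day"
        self[datetime.date(year, 12, 25)] = "Christmas Day"
        # A Saturday New Year's Day is observed on the Friday before,
        # which falls in the PREVIOUS year.
        if new_year.weekday() == 5:
            observed = new_year - datetime.timedelta(days=1)
            self[observed] = "New Year's Day (Observed)"

    def __getitem__(self, key):
        # Side effect: looking up a key may add entries for its year.
        self.load_year(key.year)
        return super().__getitem__(key)


def names_by_key_lookup(year):
    """Mirror the question's pattern: iterate the dict and index it by key."""
    table = LazyYearDict(year)
    return [table[day] for day in table]
```

Under these assumptions, `names_by_key_lookup(2015)` completes normally, while `names_by_key_lookup(2005)` raises `RuntimeError: dictionary changed size during iteration`, because the lookup of the 2004-12-31 key loads the 2004 entries mid-iteration.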

You can work around that problem by iterating over the items rather than re-accessing the dictionary with the keys:

... for date, name in us_holidays.items() if name not in trading_holidays]

Alternatively you could just convert to a list so that the iteration doesn't run through the actual dictionary:

... for date in list(us_holidays) if us_holidays[date] not in trading_holidays]
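For reference, the items-based workaround could be written out as a small helper. This is a sketch only: `market_closed_dates` and `MARKET_OPEN_HOLIDAYS` are illustrative names that do not appear in the question, and the helper accepts any mapping of date to holiday name so that it can be tried with a plain dict.

```python
import datetime

# Holidays on which US markets stay open (names taken from the question's code).
MARKET_OPEN_HOLIDAYS = frozenset({
    "Columbus Day", "Columbus Day (Observed)",
    "Veterans Day", "Veterans Day (Observed)",
})


def market_closed_dates(holiday_map, open_names=MARKET_OPEN_HOLIDAYS):
    """Return the dates in holiday_map whose name is not in open_names.

    Iterating over .items() yields each name together with its date, so the
    mapping is never indexed by key while the iteration is in progress.
    """
    return [day for day, name in holiday_map.items() if name not in open_names]
```

In the question's function this would presumably be called as `market_closed_dates(us_holidays)`, with Good Friday appended to the result as before.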
Alain T.