5

I'm confused by the interoperation between numpy and pandas date objects (or maybe just by numpy's datetime64 in general).

I was trying to count business days using numpy's built-in functionality like so:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64("28/03/2016")])

However, numpy apparently can't deal with the inverted date format:

ValueError: Error parsing datetime string "28/03/2016" at position 2

To get around this, I thought I'd just use pandas to_datetime, which can. However:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.to_datetime("28/03/2016"))])

ValueError: Cannot safely convert provided holidays input into an array of dates

Searching around for a bit, it seemed that this was caused by the fact that the chaining of to_datetime and np.datetime64 results in a datetime64[us] object, which apparently the busday_count function cannot accept (is this intended behaviour or a bug?). Thus, my next attempt was:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.Timestamp("28"), "D")])

But:

TypeError: Cannot cast datetime.datetime object from metadata [us] to [D] according to the rule 'same_kind'

And that's me out - why are there so many incompatibilities between all these datetime formats? And how can I get around them?

Nils Gudat
  • 13,222
  • 3
  • 39
  • 60
  • 1
    Since, for the first example, the other date arguments are passed in with `YYYY-MM-DD` format, why don't you just convert the date string in the first example to `2016-03-28`? e.g. `'-'.join("28/03/2016".split('/')[::-1])` – mhawke Mar 18 '16 at 10:14
  • That's one way of doing it of course - this was just the format that the holiday calendar in my real life application came in, and it triggered all of the other questions. So my post above is more about understanding what is actually going on - thanks anyway for a pragmatic solution! – Nils Gudat Mar 18 '16 at 10:19
  • Possible duplicate of [Converting between datetime, Timestamp and datetime64](http://stackoverflow.com/questions/13703720/converting-between-datetime-timestamp-and-datetime64) – ivan_pozdeev Mar 18 '16 at 10:24
  • Not an explanation, but here's another way: `pd.to_datetime("28/03/2016").strftime('%F')` – mhawke Mar 18 '16 at 10:34

2 Answers2

2

I've been having a similar issue, using np.is_busday()

The type of datetime64 is vital to get right. Checking the numpy datetime docs, you can specify the numpy datetime type to be D.

This works:

my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64[D]')

day_flags['business_day'] = np.is_busday(days,holidays=my_holidays)

Whereas this throws the same error you got:

my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64')

The only difference is specifying the type of datetime64.

dtype='datetime64[D]'

vs

dtype='datetime64'

Docs are here:

https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.datetime.html

bearcub
  • 313
  • 4
  • 11
0

I had the same issue while using np.busday_count, later I figured out the problem was with the hours, minutes, seconds, and milliseconds getting added while converting it to datetime object or numpy datetime object.

I just converted to datetime object with only date and not the hours, minutes, seconds, and milliseconds.

The following was my code:

holidays_list.json file:

{
    "holidays_2019": [
        "04-Mar-2019",
        "21-Mar-2019",
        "17-Apr-2019",
        "19-Apr-2019",
        "29-Apr-2019",
        "01-May-2019",
        "05-Jun-2019",
        "12-Aug-2019",
        "15-Aug-2019",
        "02-Sep-2019",
        "10-Sep-2019",
        "02-Oct-2019",
        "08-Oct-2019",
        "28-Oct-2019",
        "12-Nov-2019",
        "25-Dec-2019"
    ],
    "format": "%d-%b-%Y"
}

code file:

import json
import datetime
import numpy as np

with open('holidays_list.json', 'r') as infile:
    data = json.loads(infile.read())

# the following is where I convert the datetime object to date
holidays = list(map(lambda x: datetime.datetime.strptime(
    x, data['format']).date(), data['holidays_2019']))

start_date = datetime.datetime.today().date()
end_date = start_date + datetime.timedelta(days=30)
holidays = [start_date + datetime.timedelta(days=1)]
print(np.busday_count(start_date, end_date, holidays=holidays))
Jeril
  • 7,858
  • 3
  • 52
  • 69