0

I want to substract dates ignoring specific days (not only weekends but some other days, so Count number of days between dates, ignoring weekends doesn't work in that case).

Given a table referencing whether a day should count or not, how could I implement a substract function that ignore those days ?

data = {'Date':['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04'], 'key':[0, 0, 1, 0]}
df = pd.DataFrame(data)

If key == 1, then the day should not count (as a weekend day in the similar problem)

from datetime import datetime

date1 = datetime.strptime('2019-01-02 21:00:00', '%Y-%m-%d %H:%M:%S')
date2 = datetime.strptime('2019-01-04 17:00:00', '%Y-%m-%d %H:%M:%S')

date2-date1
Out[50]:
datetime.timedelta(days=1, seconds=72000)

Expected output : substract_function(date2,date1) should return 72000 seconds

cicero
  • 508
  • 3
  • 15
  • 1
    I'd opt to make a function that subtracts the two dates and then further subtracts the number of dates you want to exclude that fall in the range [date1, date2). – chris.mclennon Jan 09 '20 at 11:30

1 Answers1

1

I hope this is what you are looking for.

import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame({
    'Date':['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04'],
    'key':[0, 0, 1, 0]
    })

# transform df Dates to datetime type
df['Date'] = pd.to_datetime(df['Date']) 

# create list of the dates to be excluded
excl_dates_list = df.query('key==1').Date.to_list() 

# input dates to count the range
date1 = datetime.strptime('2019-01-02 21:00:00', '%Y-%m-%d %H:%M:%S')
date2 = datetime.strptime('2019-01-04 17:00:00', '%Y-%m-%d %H:%M:%S')

# the substraction function:
def substract_function(date2,date1):
    # if the first day is to be excluded, change it to the next day and start from 00:00 (so that 0 seconds are captured for this first day)
    if date1.date() in excl_dates_list:
        date1 = date1.replace(hour=0, minute=0, second=0) + timedelta(days=1)

    # if the last day is to be excluded, change it so it starts from 00:00 (so that 0 seconds are captured for this last day)
    if date2.date() in excl_dates_list:
        date2 = date2.replace(hour=0, minute=0, second=0)

    # count how many whole days between start and end should be excluded
    whole_days_to_exclude = len([x for x in excl_dates_list if (x>date1.date()) & (x<date2.date())])

    # return the substraction minus the number of days to be excluded
    return (date2-date1) - timedelta(days=whole_days_to_exclude)

substract_function(date2,date1)
Neo
  • 627
  • 3
  • 7