0

The code below gets two date points a and b makes a datetime database from it in the function datetime and then computes a function that calculates the range of months the datetime database is between. So if the Initial date is not in the form of x-x-01T00:00:00Z at day 1 and hour, minute at 0 then it will round the month to the closest date that satisfies x-x-01T00:00:00Z for the Initial_date. The initial date must be of an earlier time than the a values if it is not in the form of x-x-01T00:00:00Z. The b values round it off to the greatest value so it will be always equal to b if it is in the form of x-x-01T00:00:00Z or else it will be greater than b. The first 2 outputs should have '2016-02-01 00:00:00+0000' in the output as the b values ('2016-01-04T21:00:00Z')are greater than '2016-01-01 00:00:00+0000'. How would I be able to get the Expected Output below?

import pandas as pd 
from pandas.tseries.offsets import DateOffset

def dates(a, b):
    #Creates 1 minute data range between date_range(a, b)
    datetime = (pd.DataFrame(columns=['NULL'],
                      index=pd.date_range(a, b, freq='1T'))
           .index.strftime('%Y-%m-%dT%H:%M:%SZ')
           .tolist()
    )
    Initial_date = None
    Last_date = None
    
    datetime = pd.to_datetime(datetime)
    #Is the dates format x-x-01T00:00:00Z
    # For the Initial Date 
    if (datetime[0].day != 1) & (datetime[0].hour != 0) & (datetime[0].minute != 0):
        Initial_date = pd.Timestamp(min(datetime)).round('1d') - DateOffset(months=1)
    #For the final Date 
    if (datetime[-1].day != 1) & (datetime[-1].hour != 0) & (datetime[-1].minute != 0):
        Last_date = pd.Timestamp(max(datetime)).round('1d') + DateOffset(months=1)
    
    Initial_date = pd.Timestamp(min(datetime)).round('1d') if Initial_date is None else Initial_date
    Last_date = pd.Timestamp(max(datetime)).round('1d') if Last_date is None else Last_date
    
    #Month Indexes
    Monthly_idxs = pd.date_range(Initial_date, 
                                    Last_date, freq='MS').strftime("%Y-%m-%d %H:%M:%S%z").tolist()
    
    print(Monthly_idxs)
     
dates('2015-10-08T13:40:00Z', '2016-01-04T21:00:00Z')
dates('2015-10-01T00:00:00Z', '2016-01-04T21:00:00Z')
dates('2015-12-01T00:00:00Z', '2016-01-01T00:00:00Z')

Output:

['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', '2015-12-01 00:00:00+0000', 
'2016-01-01 00:00:00+0000']

['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', '2015-12-01 00:00:00+0000', 
'2016-01-01 00:00:00+0000']

['2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000']

Expected Output:

['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', 
'2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000', '2016-02-01 00:00:00+0000']

['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', 
'2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000', '2016-02-01 00:00:00+0000']

['2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000']
FObersteiner
  • 22,500
  • 8
  • 42
  • 72
bull selcuk
  • 152
  • 8

1 Answers1

0

Here's a refactored version of your function, using pandas only.

def dates(a, b):
    a, b = pd.to_datetime(a), pd.to_datetime(b)
    # check if a is at beginning of month; if not shift to beginning of *that* month
    if a > (a - pd.Timedelta(days=1) * (a.day-1)).floor('d'):
        a = (a + pd.offsets.MonthBegin(-1)).floor('d')
    # check if b is at beginning of month; if not shift to *next* month
    if b > (b - pd.Timedelta(days=1) * (b.day-1)).floor('d'):
        b = (b + pd.offsets.MonthBegin(1)).floor('d')

    return pd.date_range(a, b, freq='MS').strftime("%Y-%m-%d %H:%M:%S%z").tolist()
print(dates('2015-10-08T13:40:00Z', '2016-01-04T21:00:00Z'))
print(dates('2015-10-01T00:00:00Z', '2016-01-04T21:00:00Z'))
print(dates('2015-12-01T00:00:00Z', '2016-01-01T00:00:00Z'))

['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', '2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000', '2016-02-01 00:00:00+0000']
['2015-10-01 00:00:00+0000', '2015-11-01 00:00:00+0000', '2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000', '2016-02-01 00:00:00+0000']
['2015-12-01 00:00:00+0000', '2016-01-01 00:00:00+0000']

Related: How to floor a date to the first date of that month?

FObersteiner
  • 22,500
  • 8
  • 42
  • 72