1

I have a table that has a column Months_since_Start_fin_year and a Date column. I need to add the number of months in the first column to the date in the second column.

DateTable['Date']=DateTable['First_month']+DateTable['Months_since_Start_fin_year'].astype("timedelta64[M]")

This works OK for month 0, but month 1 already has a different time and for month 2 onwards has the wrong date. Image of output table where early months have the correct date but month 2 where I would expect June 1st actually shows May 31st It must be adding incomplete months, but I'm not sure how to fix it?

I have also tried

DateTable['Date']=DateTable['First_month']+relativedelta(months=DateTable['Months_since_Start_fin_year'])

but I get a type error that says

TypeError: cannot convert the series to <class 'int'>

My Months_since_Start_fin_year is type int32 and my First_month variable is datetime64[ns]

FObersteiner
  • 22,500
  • 8
  • 42
  • 72
Lotte
  • 13
  • 3
  • 1
    Could you please add sample data as text instead of image? See [mre]: That makes it much easier to reproduce the problem and give you a good answer :) – FObersteiner Apr 07 '21 at 17:37

1 Answers1

0

The problem with adding months as an offset to a date is that not all months are equally long (28-31 days). So you need pd.DateOffset which handles that ambiguity for you. .astype("timedelta64[M]") on the other hand only gives you the average days per month within a year (30 days 10:29:06).

Ex:

import pandas as pd

# a synthetic example since you didn't provide a mre
df = pd.DataFrame({'start_date': 7*['2017-04-01'],
                   'month_offset': range(7)})

# make sure we have datetime dtype
df['start_date'] = pd.to_datetime(df['start_date'])

# add month offset
df['new_date'] = df.apply(lambda row: row['start_date'] + 
                                      pd.DateOffset(months=row['month_offset']),
                                      axis=1)

which would give you e.g.

df
  start_date  month_offset   new_date
0 2017-04-01             0 2017-04-01
1 2017-04-01             1 2017-05-01
2 2017-04-01             2 2017-06-01
3 2017-04-01             3 2017-07-01
4 2017-04-01             4 2017-08-01
5 2017-04-01             5 2017-09-01
6 2017-04-01             6 2017-10-01

You can find similar examples here on SO, e.g. Add months to a date in Pandas. I only modified the answer there by using an apply to be able to take the months offset from one of the DataFrame's columns.

FObersteiner
  • 22,500
  • 8
  • 42
  • 72