1

My dataframe has two columns. When I subtract them to get the month in between, I got some weird numbers. Here is an example:

test = pd.DataFrame({'reg_date': [datetime(2017,3,1), datetime(2016,9,1)], 
                 'leave_date':[datetime(2017,7,1), datetime(2017,6,1)]})
test['diff_month'] = test.leave_date.dt.month - test.reg_date.dt.month
test

The output:

enter image description here

If a user's register_date is last year, I get a negative number (also incorrect as well).

What operations should I perform to get the correct time difference in month between two datetime column?


Update: I changed the example a bit so it reflects more about the issue I am facing. Don't down vote so fast guys.

A hack I did to fix this is:

test['real_diff'] = test.diff_month.apply(lambda x: x if x > 0 else 12+x)

I don't like the hack so I am curious if there is any other way of doing it.

Cheng
  • 16,824
  • 23
  • 74
  • 104
  • b.month = 5, a.month = 12 --> 5 - 12 = -7 .. Maybe try `a.month` - `b.month` – user3053452 Jul 20 '17 at 09:09
  • Possible duplicate of [Python: Difference of 2 datetimes in months](https://stackoverflow.com/questions/7015587/python-difference-of-2-datetimes-in-months) – perigon Jul 20 '17 at 09:11
  • Same example here : https://stackoverflow.com/questions/41199351/month-subtract-month – Clément Jul 20 '17 at 09:11

2 Answers2

2

To get your result you can use relativedelta from dateutil:

import datetime
from dateutil import relativedelta

a = datetime.datetime(2016, 12, 1)
b = datetime.datetime(2017, 5, 1)

relativedelta.relativedelta(b, a).months
#5
zipa
  • 27,316
  • 6
  • 40
  • 58
2

IIUC you can call apply and use relativedelta as @zipa suggested:

In[29]:
from dateutil import relativedelta
test['real_diff'] = test.apply(lambda row: relativedelta.relativedelta(row['leave_date'], row['reg_date']).months, axis=1)
test

Out[29]: 
  leave_date   reg_date  real_diff
0 2017-07-01 2017-03-01          4
1 2017-06-01 2016-09-01          9
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • Note this wont't work as expected if the difference is 12 months or more. For that you'll need to do use the `relativedelta.relativedelta.years` attribute as well. – sundance Apr 17 '18 at 18:54