
I've downloaded a table from SQL and am now performing operations on it. Two of the columns are START_DATE and MATURITY_DATE; I want to subtract one from the other and then divide the result by another column, TERM. I've worked out the subtraction using this code:

df['START_DATE']=pd.to_datetime(df['START_DATE'])
df['MATURITY_DATE']=pd.to_datetime(df['MATURITY_DATE'])

df['c']=df['MATURITY_DATE']-df['START_DATE']
df['d']=df['c'].div(df['TERM'], axis=0)

However, when I divide 'c', which is a timedelta, by 'TERM', which is an int, the new df['d'] looks like "60 days 4:00:00". I don't want a duration; I want df['d'] rounded to the nearest integer so that I can perform operations on it as if it's an integer. Essentially, how do I turn df['d'] into the nearest integer?

Andrej Kesely
throway172

1 Answer


You can use pandas.Series.dt.round and tell it to round to the nearest day.

(df['MATURITY_DATE']-df['START_DATE']).div(df.TERM).dt.round('1d')

If you need numbers rather than a timedelta, divide by a one-day numpy timedelta to convert (the result is a float column):

(df['MATURITY_DATE']-df['START_DATE']).div(df.TERM).dt.round('1d')/np.timedelta64(1, 'D')

Sample Data

import pandas as pd
import numpy as np

df = pd.DataFrame({'START_DATE': [pd.to_datetime('2017-01-01'), pd.to_datetime('2017-01-01')],
                  'MATURITY_DATE': [pd.to_datetime('2017-01-06 11:00:00'), pd.to_datetime('2017-01-06 13:00:00')],
                  'TERM': [3,2]})

#  START_DATE       MATURITY_DATE  TERM
#0 2017-01-01 2017-01-06 11:00:00     3
#1 2017-01-01 2017-01-06 13:00:00     2

Code:

(df['MATURITY_DATE']-df['START_DATE']).div(df.TERM).dt.round('1d')
#0   2 days
#1   3 days
#dtype: timedelta64[ns]

(df['MATURITY_DATE']-df['START_DATE']).div(df.TERM).dt.round('1d')/np.timedelta64(1, 'D')
#0    2.0
#1    3.0
#dtype: float64
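
The division above yields floats, while the question asks for integers. One way to get an integer column, offered as a minimal sketch rather than the only approach, is to read the day count off the rounded timedelta with `.dt.days`. It reuses the sample data from this answer and assumes no missing dates; with a `NaT` in either column I believe `.dt.days` falls back to floats with `NaN`.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'START_DATE': pd.to_datetime(['2017-01-01', '2017-01-01']),
    'MATURITY_DATE': pd.to_datetime(['2017-01-06 11:00:00', '2017-01-06 13:00:00']),
    'TERM': [3, 2],
})

# A timedelta divided by an integer is still a timedelta.
per_term = (df['MATURITY_DATE'] - df['START_DATE']).div(df['TERM'])

# Round to whole days, then take the number of days as an integer.
df['d'] = per_term.dt.round('1d').dt.days

print(df['d'].tolist())        # [2, 3]
print((df['d'] * 2).tolist())  # ordinary integer arithmetic now works: [4, 6]
```

If a float column is acceptable, dividing by `np.timedelta64(1, 'D')` as shown above does the same job.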
ALollz
  • I tried both of those `pandas.series.dt.round` in place of `df['START_DATE']=pd.to_datetime(df['START_DATE']) df['MATURITY_DATE']=pd.to_datetime(df['MATURITY_DATE']) df['c']=df['MATURITY_DATE']-df['START_DATE'] df['d']=df['c'].div(df['TERM'], axis=0)` and got "unsupported operand type(s) for -: 'str' and 'str'" – throway172 Jul 09 '18 at 15:44
  • @throway172 can you print `df.dtypes`? It looks like your columns aren't being converted to `datetime64` before you try to subtract them – ALollz Jul 09 '18 at 15:49
  • @throway172 you shouldn't be using my code IN PLACE of that entire chunk of code. You need to keep the first two lines, which convert using `pd.to_datetime`, but then you can replace the lines where you define `df['c']` and `df['d']` with the single line above. So you would have `df['d'] = (df['MATURITY_DATE']-df['START_DATE']).div(df.TERM).dt.round('1d')/np.timedelta64(1, 'D')` – ALollz Jul 09 '18 at 15:51