I would like to return a specific string for each row based on the difference between two date columns. My code so far:
# Parenthesize the comparison: & binds more tightly than !=
df_Date = df[df['state'].str.contains('Traded Away') & (df['maturity_date'] != 0)][['state', 'maturity_date']]
df_Date['maturity_date'] = pd.to_datetime(df_Date['maturity_date'])
df_Date['Today'] = pd.to_datetime('today')
df_Date['Days'] = (df_Date['maturity_date'] - df_Date['Today'])
print(df_Date.head(10))
          state  maturity_date       Today        Days
0   Traded Away     2018-03-15  2018-03-19     -4 days
10  Traded Away     2025-06-15  2018-03-19   2645 days
12  Traded Away     2047-03-21  2018-03-19  10594 days
15  Traded Away     2166-03-15  2018-03-19  54052 days
17  Traded Away     2166-12-18  2018-03-19  54330 days
20  Traded Away     2023-05-04  2018-03-19   1872 days
22  Traded Away     2027-11-15  2018-03-19   3528 days
23  Traded Away     2025-03-15  2018-03-19   2553 days
25  Traded Away     2023-01-15  2018-03-19   1763 days
26  Traded Away     2166-05-01  2018-03-19  54099 days
My function to convert the days to strings is below. Running it raises the error: TypeError: invalid type comparison
def Risk_Bucket(x):
    if x <= 730:
        return '< 2YR'
    elif (x > 730 and x <= 1825):
        return '2YR_5YR'
    elif (x > 1825 and x <= 2555):
        return '5YR_7YR'
    elif (x > 2555 and x <= 3650):
        return '7YR_10YR'
    elif (x > 3650 and x <= 7300):
        return '10YR_20YR'
    elif (x > 7300):
        return '> 20YR'
    else:
        return "Check passed Date"
df_Date['Bucket'] = Risk_Bucket(df_Date['Days'])
print(df_Date.head(10))
I assume this is because the Days column has the string 'days' in it?
How do I make the Days column numeric? Any suggestions to resolve this and improve my code?
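For reference, here is a minimal sketch of one direction that might work; it is an assumption about the cause, not a confirmed fix. The Days column appears to hold timedelta values rather than strings, so the .dt.days accessor should give plain integers, and pd.cut can then assign the labels using the same cut-offs as Risk_Bucket. The sample rows and the fixed "today" date are copied from the output above purely for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative sample taken from the printed output above
df_Date = pd.DataFrame({
    "state": ["Traded Away"] * 4,
    "maturity_date": pd.to_datetime(
        ["2018-03-15", "2025-06-15", "2047-03-21", "2023-05-04"]
    ),
})

# Fixed date so the example is reproducible (assumption; the question uses 'today')
today = pd.Timestamp("2018-03-19")

# Subtracting dates yields timedeltas; .dt.days extracts whole days as integers
df_Date["Days"] = (df_Date["maturity_date"] - today).dt.days

# Same thresholds as Risk_Bucket; intervals are right-inclusive, i.e. (lower, upper]
bins = [-np.inf, 730, 1825, 2555, 3650, 7300, np.inf]
labels = ["< 2YR", "2YR_5YR", "5YR_7YR", "7YR_10YR", "10YR_20YR", "> 20YR"]
df_Date["Bucket"] = pd.cut(df_Date["Days"], bins=bins, labels=labels)

print(df_Date)
```

If the original function needs to be kept, df_Date['Days'].apply(Risk_Bucket) on the integer column would call it once per value instead of passing the whole column at once.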