57

I have a datetime column as below -

>>> df['ACC_DATE'].head(2)
538   2006-04-07
550   2006-04-12
Name: ACC_DATE, dtype: datetime64[ns]

Now, I want to subtract an year from each row of this column. How can I achieve the same & which library can I use?

The expected field -

        ACC_DATE    NEW_DATE
538   2006-04-07  2005-04-07
549   2006-04-12  2005-04-12
Asclepius
  • 57,944
  • 17
  • 167
  • 143
0nir
  • 1,345
  • 4
  • 20
  • 41

4 Answers4

107

You can use DateOffset to achieve this:

In[88]:
df['NEW_DATE'] = df['ACC_DATE'] - pd.DateOffset(years=1)
df

Out[88]: 
        ACC_DATE   NEW_DATE
index                      
538   2006-04-07 2005-04-07
550   2006-04-12 2005-04-12
Max Ghenis
  • 14,783
  • 16
  • 84
  • 132
EdChum
  • 376,765
  • 198
  • 813
  • 562
18

Use DateOffset:

df["NEW_DATE"] = df["ACC_DATE"] - pd.offsets.DateOffset(years=1)
print (df)
        ACC_DATE   NEW_DATE
index                      
538   2006-04-07 2005-04-07
550   2006-04-12 2005-04-12
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • May i ask what's the difference between `pd.DateOffset(years=1)` and yours `pd.offsets.DateOffset(years=1)`? – ah bon Oct 28 '21 at 07:21
  • @ahbon if use `plural years` added 1 year, if use `singular year` set to year `1` – jezrael Oct 28 '21 at 07:22
14

You could use pd.Timedelta:

df["NEW_DATE"] = df["ACC_DATE"] - pd.Timedelta(days=365) 

Or replace:

df["NEW_DATE"] = df["ACC_DATE"].apply(lambda x: x.replace(year=x.year - 1))

But neither will catch leap years so you could use dateutil.relativedelta :

from dateutil.relativedelta import  relativedelta

df["NEW_DATE"] = df["ACC_DATE"].apply(lambda x: x - relativedelta(years=1))
Padraic Cunningham
  • 176,452
  • 29
  • 245
  • 321
0

If having a single pd.Timestamp object rather than a column,

  1. Using pd.DateOffset(years=n) is not ideal as it produces:

UserWarning: Discarding nonzero nanoseconds in conversion

  1. pd.Timedelta() doesn't accept years.

The only approach that worked for me in this case is pd.Timestamp.replace:

t = pd.Timestamp.now()
t = t.replace(year=t.year - n)

This was hinted at in the answer by Padriac but it needed further clarity.

Asclepius
  • 57,944
  • 17
  • 167
  • 143