Data

I have the following data:

import numpy as np
import pandas as pd

data = [['1987-09-01', 5], ['1987-09-01', 2.66], ['1987-09-01', np.nan]]
df = pd.DataFrame(data, columns=['Date', 'year'])
df['Date'] = pd.to_datetime(df['Date'])

Goal

I want to subtract the number of years in the year column from the corresponding Date. Where year is np.nan, nothing should be subtracted.

Attempt

My attempt is as follows:

df['Date'] - pd.to_timedelta(df.year.astype(str), units = 'Y')

Which leads to the following error:

ValueError: no units specified

I know that pd.to_timedelta does not support years as a unit. How can I accomplish my goal another way?

  • Think you could probably do (untested): `df['Date'] - pd.to_timedelta(df.year.fillna(0), unit='y')`? (note that it's `unit` (singular) - not plural) – Jon Clements Dec 04 '19 at 15:22
  • Does this answer your question? [Subtract an year from a datetime column in pandas](https://stackoverflow.com/questions/31169774/subtract-an-year-from-a-datetime-column-in-pandas) – eva-vw Dec 04 '19 at 15:28
  • `Y` and `M` are being deprecated because they do not represent a fixed amount of time. To achieve what you want, you should consider what a year represents to you. - If, in your understanding, "a year" means "365 days", just multiply your delta by 365 and use `day` as the unit (in this case, you may want to use 1 year = 365.25 days to account for leap years). - If a variation of 1 year means the same day in the next/previous year, just add/subtract the integer part of the delta from the year field and decide how to use the remaining decimal part of the delta (ask yourself what 0.66 of a year means). – Diego Queiroz Dec 04 '19 at 17:58
  • However, although `Y` is deprecated, it still works, as @JonClements said. – Diego Queiroz Dec 04 '19 at 18:01
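
The "1 year = 365.25 days" interpretation from the comments can be sketched as follows. This is only a rough sketch: the constant `DAYS_PER_YEAR` and the column name `shifted` are hypothetical names introduced here, and treating NaN as zero years is an assumption based on the stated goal.

```python
import numpy as np
import pandas as pd

data = [['1987-09-01', 5], ['1987-09-01', 2.66], ['1987-09-01', np.nan]]
df = pd.DataFrame(data, columns=['Date', 'year'])
df['Date'] = pd.to_datetime(df['Date'])

# Assumption: a "year" is a fixed span of 365.25 days.
DAYS_PER_YEAR = 365.25

# NaN becomes 0 years, so those rows are left unchanged.
delta = pd.to_timedelta(df['year'].fillna(0) * DAYS_PER_YEAR, unit='D')
df['shifted'] = df['Date'] - delta
```

Because the span is fixed, the result generally does not land on the same calendar day and can carry a time-of-day component.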

1 Answer

pd.DateOffset should work for you. For example, to subtract one year from every date:

df['Date'] = pd.to_datetime(df['Date'])
df['Date'] = df['Date'] - pd.DateOffset(years=1)
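
To subtract a different number of years per row, one option might be to build a DateOffset for each row. The sketch below assumes that only the whole-year part should be subtracted (so 2.66 is truncated to 2) and that NaN means zero years. The names `whole_years` and `shifted` are hypothetical.

```python
import numpy as np
import pandas as pd

data = [['1987-09-01', 5], ['1987-09-01', 2.66], ['1987-09-01', np.nan]]
df = pd.DataFrame(data, columns=['Date', 'year'])
df['Date'] = pd.to_datetime(df['Date'])

# Assumption: drop the fractional part of the year; NaN counts as 0 years.
whole_years = df['year'].fillna(0).astype(int)

# Calendar-aware subtraction, one offset per row.
df['shifted'] = [
    date - pd.DateOffset(years=int(years))
    for date, years in zip(df['Date'], whole_years)
]
```

This keeps the same month and day in the earlier year, at the cost of a Python-level loop, which may be slow on large frames.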
eva-vw