I recently upgraded my code from Python 3.3 to Python 3.7, and it's currently throwing an error which says:

ValueError: Units 'M' and 'Y' are no longer supported, as they do not represent unambiguous timedelta values durations

This is puzzling, because the code was working fine before the upgrade.

Here's the offending part of the code:

df['date_modified'] = (df['date_variable']-pd.to_timedelta(df['years_variable'], unit = 'Y')).dt.date

And here is the full code:

import pandas as pd
import numpy as np

idx = [np.array(['Jan-18', 'Jan-18', 'Feb-18', 'Mar-18', 'Mar-18', 'Mar-18','Apr-18', 'Apr-18', 'May-18', 'Jun-18', 'Jun-18', 'Jun-18','Jul-18', 'Aug-18', 'Aug-18', 'Sep-18', 'Sep-18', 'Oct-18','Oct-18', 'Oct-18', 'Nov-18', 'Dec-18', 'Dec-18',]),np.array(['A', 'B', 'B', 'A', 'B', 'C', 'A', 'B', 'B', 'A', 'B', 'C','A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'A', 'B', 'C'])]
data = [{'years_variable': 1}, {'years_variable': 5}, {'years_variable': 3}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 3},{'years_variable': 1}, {'years_variable': 6}, {'years_variable': 3}, {'years_variable': 5}, {'years_variable': 2}, {'years_variable': 3},{'years_variable': 1}, {'years_variable': 9}, {'years_variable': 3}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 3}, {'years_variable': 6}, {'years_variable': 8}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 9}]
df = pd.DataFrame(data, index=idx, columns=['years_variable'])
df.index.names=['date_variable','type']
df=df.reset_index()
df['date_variable'] = pd.to_datetime(df['date_variable'],format = '%b-%y') # http://strftime.org/
df=df.set_index(['date_variable','type'])
df=df.reset_index()
print(df)

df['date_modified'] = (df['date_variable']-pd.to_timedelta(df['years_variable'], unit = 'Y')).dt.date
mfonism
Mario Arend
  • Without the code in question, the errors, etc., there is nothing StackOverflow users can do to help you. – felipe Feb 07 '20 at 23:24
  • Please add the error that you are seeing. – felipe Feb 07 '20 at 23:24
  • can you post some sample data with what you want? – Umar.H Feb 07 '20 at 23:32
  • Sorry again for the bad editing of my question. I thought I was going to have a command to explain the question when I clicked next, but I didn't, so the question was published without the code in question. I hope you understand. – Mario Arend Feb 07 '20 at 23:33

1 Answer

This is not a Python issue, but something to do with pandas.

As of version 0.25.0, pandas deprecated the units "M" (months) and "Y" (years) in timedelta functions. Since version 1.0.0, using them raises the ValueError shown in the question.

https://pandas-docs.github.io/pandas-docs-travis/whatsnew/v0.25.0.html#other-deprecations

This specifically affects pandas.to_timedelta(), pandas.Timedelta() and pandas.TimedeltaIndex().

You'll have to rewrite your code to express these durations in days instead of years (or months). Keep in mind that a fixed number of days per year is an approximation: it ignores leap years, so results can be off by a day or more.
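
As a rough sketch of what "days equivalent" can look like with the functions named above: the conversion factors `DAYS_PER_YEAR` and `DAYS_PER_MONTH` below are assumptions chosen for illustration, not constants provided by pandas, so pick values that suit your data.

```python
import pandas as pd

# Assumed conversion factors (NOT defined by pandas); both are approximations.
DAYS_PER_YEAR = 365
DAYS_PER_MONTH = 30

# Scalar durations expressed in days instead of unit='Y' / unit='M'.
two_years = pd.Timedelta(2 * DAYS_PER_YEAR, unit="D")
three_months = pd.to_timedelta(3 * DAYS_PER_MONTH, unit="D")

# A list-like input to to_timedelta gives a TimedeltaIndex.
year_steps = pd.to_timedelta([n * DAYS_PER_YEAR for n in (1, 2, 3)], unit="D")

print(two_years)
print(three_months)
print(year_steps)
```

Keeping the factors in named constants makes the approximation explicit and easy to change later.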


Here's a link to the issue on GitHub which led to this deprecation, and here's a link to the PR that resolved the issue and effected the deprecation.


UPDATE: Here's a More Recent Modification of Your Code

This makes fewer changes to your original code than the block of code after it:

import pandas as pd
import numpy as np

idx = [np.array(['Jan-18', 'Jan-18', 'Feb-18', 'Mar-18', 'Mar-18', 'Mar-18','Apr-18', 'Apr-18', 'May-18', 'Jun-18', 'Jun-18', 'Jun-18','Jul-18', 'Aug-18', 'Aug-18', 'Sep-18', 'Sep-18', 'Oct-18','Oct-18', 'Oct-18', 'Nov-18', 'Dec-18', 'Dec-18',]),np.array(['A', 'B', 'B', 'A', 'B', 'C', 'A', 'B', 'B', 'A', 'B', 'C','A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'A', 'B', 'C'])]
data = [{'years_variable': 1}, {'years_variable': 5}, {'years_variable': 3}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 3},{'years_variable': 1}, {'years_variable': 6}, {'years_variable': 3}, {'years_variable': 5}, {'years_variable': 2}, {'years_variable': 3},{'years_variable': 1}, {'years_variable': 9}, {'years_variable': 3}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 3}, {'years_variable': 6}, {'years_variable': 8}, {'years_variable': 2}, {'years_variable': 7}, {'years_variable': 9}]
df = pd.DataFrame(data, index=idx, columns=['years_variable'])
df.index.names=['date_variable','type']
df=df.reset_index()
df['date_variable'] = pd.to_datetime(df['date_variable'],format = '%b-%y') # http://strftime.org/
df=df.set_index(['date_variable','type'])
df=df.reset_index()

# this is all we're touching
# multiply the values under the 'years_variable' column by 365
# to get the number of days
# and use the 'D' unit in the timedelta, to indicate that it's actually in days
df['date_modified'] = (df['date_variable']-pd.to_timedelta(df['years_variable']*365, unit = 'D')).dt.date

print(df)

OUTPUT

   date_variable type  years_variable date_modified
0     2018-01-01    A               1    2017-01-01
1     2018-01-01    B               5    2013-01-02
2     2018-02-01    B               3    2015-02-02
3     2018-03-01    A               2    2016-03-01
4     2018-03-01    B               7    2011-03-03
5     2018-03-01    C               3    2015-03-02
6     2018-04-01    A               1    2017-04-01
7     2018-04-01    B               6    2012-04-02
8     2018-05-01    B               3    2015-05-02
9     2018-06-01    A               5    2013-06-02
10    2018-06-01    B               2    2016-06-01
11    2018-06-01    C               3    2015-06-02
12    2018-07-01    A               1    2017-07-01
13    2018-08-01    B               9    2009-08-03
14    2018-08-01    C               3    2015-08-02
15    2018-09-01    A               2    2016-09-01
16    2018-09-01    B               7    2011-09-03
17    2018-10-01    C               3    2015-10-02
18    2018-10-01    A               6    2012-10-02
19    2018-10-01    B               8    2010-10-03
20    2018-11-01    A               2    2016-11-01
21    2018-12-01    B               7    2011-12-03
22    2018-12-01    C               9    2009-12-03
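
Some rows in the output above land a day or more off the original calendar date (for example 2013-01-02 rather than 2013-01-01), because 365-day years skip over leap days. If whole calendar years are what you need, one possible alternative (not part of the original answer) is `pd.DateOffset`, applied row by row. The sketch below uses a small stand-in frame with values copied from rows 0 and 1; the column names `by_days` and `by_calendar` are made up for the comparison.

```python
import pandas as pd

# Small stand-in for the question's frame (values copied from rows 0 and 1).
df = pd.DataFrame({
    "date_variable": pd.to_datetime(["2018-01-01", "2018-01-01"]),
    "years_variable": [1, 5],
})

# Fixed-length approach from the answer: 365 days per year.
df["by_days"] = (
    df["date_variable"] - pd.to_timedelta(df["years_variable"] * 365, unit="D")
).dt.date

# Calendar-aware alternative: subtract whole years, one row at a time.
# int() converts the NumPy integer to a plain Python int for DateOffset.
df["by_calendar"] = [
    (d - pd.DateOffset(years=int(y))).date()
    for d, y in zip(df["date_variable"], df["years_variable"])
]

print(df)
```

The row-by-row loop is slower than the vectorized timedelta subtraction, so on large frames it is a trade-off between speed and calendar accuracy.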

Here's an Older Modification of Your Code, Which May Work for You

You should totally ignore it. I'm just keeping it for historical purposes.

import pandas as pd
import numpy as np

idx = [np.array(['Jan-18', 'Jan-18', 'Feb-18', 'Mar-18', 'Mar-18', 'Mar-18','Apr-18', 'Apr-18', 'May-18', 'Jun-18', 'Jun-18', 'Jun-18','Jul-18', 'Aug-18', 'Aug-18', 'Sep-18', 'Sep-18', 'Oct-18','Oct-18', 'Oct-18', 'Nov-18', 'Dec-18', 'Dec-18',]),np.array(['A', 'B', 'B', 'A', 'B', 'C', 'A', 'B', 'B', 'A', 'B', 'C','A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'A', 'B', 'C'])]

# convert the values of the inner dicts from years to days
# because we'll be specifying 'D' units in the `timedelta` function
# as opposed to the now deprecated 'Y' units which we used previously
data = [{'years_variable': 365}, {'years_variable': 1825}, {'years_variable': 1095}, {'years_variable': 730}, {'years_variable': 2555}, {'years_variable': 1095}, {'years_variable': 365}, {'years_variable': 2190}, {'years_variable': 1095}, {'years_variable': 1825}, {'years_variable': 730}, {'years_variable': 1095}, {'years_variable': 365}, {'years_variable': 3285}, {'years_variable': 1095}, {'years_variable': 730}, {'years_variable': 2555}, {'years_variable': 1095}, {'years_variable': 2190}, {'years_variable': 2920}, {'years_variable': 730}, {'years_variable': 2555}, {'years_variable': 3285}]
df = pd.DataFrame(data, index=idx, columns=['years_variable'])
df.index.names=['date_variable','type']
df=df.reset_index()
df['date_variable'] = pd.to_datetime(df['date_variable'],format = '%b-%y') # http://strftime.org/
df=df.set_index(['date_variable','type'])
df=df.reset_index()

# specify 'D' units in the timedelta function
df['date_modified'] = (df['date_variable']-pd.to_timedelta(df['years_variable'], unit='D')).dt.date

print(df)
mfonism
  • Great concise answer with explanation and possible solution. – felipe Feb 08 '20 at 00:08
  • I tried df['date_modified'] = (df['date_variable']-pd.to_timedelta(365)).dt.date , but it only subtracts 1 day. I need to subtract 1 year. – Mario Arend Feb 08 '20 at 00:19
  • I also tried df['date_modified'] = (df['date_variable']-pd.to_timedelta(years='years_variable')).dt.date , but it doesn't work. – Mario Arend Feb 08 '20 at 00:30
  • I don't really know much about data science, but I have a suggestion. Leave your code the way it was previously (at the failing state). Then, go to your `data` variable (defined on the fifth line in the code you put up here) and translate all those values in the inner dicts to their day equivalent. Leave everything else untouched. Convert only the numbers. Then run your code. If it works, then you'll just have to figure out how to rename the column `years_variable` so someone else can understand the code. For example, I might call it, `days_offset`. – mfonism Feb 08 '20 at 00:30
  • After converting the numbers, take out the `unit=Y` parameter in the offending line of the code (the last line in the one you posted here). – mfonism Feb 08 '20 at 00:36
  • Can you write the code? It is very difficult to understand in words. – Mario Arend Feb 08 '20 at 00:54