I understand why you can't add a year/month timedelta64 to a day, since months and years can have different numbers of days. But I expected adding a year to a date to work, because all the necessary information is available. Alas, I am saddened:

import numpy as np

print(np.datetime64("2015-06-01") + np.timedelta64(1, "Y"))

# TypeError: Cannot get a common metadata divisor for NumPy datetime metadata [D] and [Y] because they have incompatible nonlinear base time units

How do I make it work?

Edit:

The answer to the duplicate question is unsuitable. I'm looking to do it as best as I can, ignoring the corner cases. I'm trying to get nice date ticks, so being inexact is fine.

Something like downcasting datetime64[D] to datetime64[M], if I need to.

csiz
  • Possible duplicate of [numpy datetime64 add or substract date interval](http://stackoverflow.com/questions/22842314/numpy-datetime64-add-or-substract-date-interval) – Francisco Oct 27 '16 at 17:34
  • Why would you expect this to work? Years aren't a consistent amount of time. What would happen if you tried to add a year to February 29th, 2004? – user2357112 Oct 27 '16 at 17:35
  • Possible duplicate of [Add one year in current date PYTHON](http://stackoverflow.com/questions/15741618/add-one-year-in-current-date-python) – dimo414 Oct 27 '16 at 18:10
  • @user2357112 I would expect this to work because dates are a human concept, and we can talk coherently about "*one year from today*" even on February 29th. Do you think people born on the 29th don't age at the same rate as everyone else? The abstract concepts of time can be represented in code; one must simply be careful about how the edge cases are handled. That's what date-time libraries are for. – dimo414 Oct 27 '16 at 18:16

2 Answers

This simply will not work, at least not with numpy alone.

Days, hours, minutes, seconds can all be converted because they have compatible base units. There is always 60 seconds in a minute, always 60 minutes in an hour, always 24 hours in a day.

Years and Months are treated specially, because how much time they represent changes depending on when they are used. While a timedelta day unit is equivalent to 24 hours, there is no way to convert a month unit into days, because different months have different numbers of days. By extension, there is no way to convert years into days either.
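To make the distinction concrete, here is a small sketch of which unit combinations NumPy accepts. The variable names are only for illustration; the failing case is the same one shown in the question.

```python
import numpy as np

day = np.datetime64("2015-06-01")  # unit: [D]

# Linear units mix freely: hours convert to days at a fixed ratio.
next_day = day + np.timedelta64(24, "h")

# Years and months are compatible with each other (1 Y == 12 M),
# so this works as long as the date itself has month precision.
next_year_month = np.datetime64("2015-06") + np.timedelta64(1, "Y")

# Mixing a [D] date with a [Y] timedelta raises the TypeError
# from the question.
try:
    day + np.timedelta64(1, "Y")
    mixed_units_failed = False
except TypeError:
    mixed_units_failed = True
```

So the year arithmetic itself is fine; what NumPy refuses to do is pick a conversion between the year/month family and the day family.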

In order to implement this appropriately, you will need to decide how to resolve conflicts such as leap years. This is not something that can be done with numpy alone. The way arithmetic works with numpy.datetime64 objects is different from other libraries and, as mentioned in the documentation, it is not possible to convert between days and months.
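As an illustration of what "deciding" means, here is a minimal sketch of two possible policies using only the standard library. The function names `add_years_clamp` and `add_years_rollover` are made up for this example; neither is an existing API, and other policies are equally defensible.

```python
import calendar
from datetime import date

def add_years_clamp(d, years):
    """Policy A: clamp to the last valid day of the target month."""
    year = d.year + years
    last_day = calendar.monthrange(year, d.month)[1]
    return date(year, d.month, min(d.day, last_day))

def add_years_rollover(d, years):
    """Policy B: roll an invalid Feb 29 over to Mar 1."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only Feb 29 can become invalid when just the year changes.
        return date(d.year + years, 3, 1)

leap_day = date(2016, 2, 29)
clamped = add_years_clamp(leap_day, 1)         # 2017-02-28
rolled_over = add_years_rollover(leap_day, 1)  # 2017-03-01
```

Both answers are reasonable, which is exactly why a library has to pick one.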

Ordinary datetime and relativedeltas would work, because these libraries have codified the behavior on such conflicts.

from dateutil.relativedelta import relativedelta
from datetime import datetime
datetime(2016, 2, 29) + relativedelta(years=1)
#datetime.datetime(2017, 2, 28, 0, 0)

So, if you like how these datetime libraries sort it out, something like this would get you the result:

from datetime import datetime
from dateutil.relativedelta import relativedelta
import numpy as np
def fuzzy_add(npdt, years):
    year, month, day = str(npdt).split("-")
    d = datetime(int(year), int(month), int(day))
    delta = relativedelta(years=years)
    the_date = d + delta
    new_npdt = np.datetime64(the_date.isoformat()[:10])
    return new_npdt

Example:

fuzzy_add(np.datetime64("2016-02-29"), 1)
#numpy.datetime64('2017-02-28')
sytech
  • Yeah, but how do I make it work? I can't find in the documentation a way to tell downcast from days to months.. – csiz Oct 27 '16 at 17:45
  • @csiz My point was that it's not *possible* to do, period. If you wanted to make the number of days in a year static, you could do `np.datetime64("2015-06-01") + np.timedelta64(366, "D")` but that probably won't give you any kind of desirable or consistent outcome, because of the issues with number of days in months and years. – sytech Oct 27 '16 at 17:51
  • But it is possible. If I say a year from now on 2015-06-01 the other person would usually understand 2016-06-01 even though it's a leap year. I'm trying to replicate what people understand by a year from now. (And I'm close to a solution). – csiz Oct 27 '16 at 17:55
  • What happens if the date that existed in one year is not present in another? What should the behavior be? If you defined how to resolve those kinds of conflicts, you could probably make a solution that way, but not with just `datetime64` and `timedelta64` objects alone. By nature, Timedeltas can't work in terms of both months and days. The way `np.datetime64` objects do arithmetic with timedeltas, it's not possible. For illustration `(2016-2-29)` is a valid datetime, but `(2017-2-29)` is not. Should that be `(2017-2-28)` or `(2017-3-1)`? This is the ambiguity problem that makes this impossible – sytech Oct 27 '16 at 18:01
  • @sytech you're saying one *literally cannot* take a date like `2015-06-01` and conceptualize that one year later is `2016-06-01`? That's obviously false, and date-time libraries *exist* to make such operations possible. Java's [`LocalDate`](http://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html#plusYears-long-) class, for instance, makes this trivial: `LocalDate.of(2015, 6, 1).plusYears(1)`. – dimo414 Oct 27 '16 at 18:09
  • @dimo414 Can you tell me then what "One year from `2016-02-29`" is? I'm saying it's not possible with `np.datetime64` and `np.timedelta64` objects alone. – sytech Oct 27 '16 at 18:11
  • That's up to the date-time library to decide. In the case of Java's time package they went with "*For example, 2008-02-29 (leap year) plus one year would result in the invalid date 2009-02-29 (standard year). Instead of returning an invalid result, the last valid day of the month, 2009-02-28, is selected instead.*" It is *possible* to do arithmetic operations on dates and times, even if the results are not always perfect - dates aren't perfect. – dimo414 Oct 27 '16 at 18:14
  • @dimo414 *"That's up to the date-time library to decide."* Exactly. I mentioned in a previous comment, if you can define how those conflicts should resolve, you can do it. But it's not possible using `np.datetime64` and `np.timedelta64` objects. Pandas, for example, has tools for resolving those conflicts. All of the scalar types in numpy are derived from `numpy.timedelta64` – sytech Oct 27 '16 at 18:16
  • Not until your previous comment did you clarify that you were referring only to `np.datetime64`; before that, and in your answer, you acted as if this is conceptually incoherent and impossible to support. If you want to say in your answer that it's a limitation of *this library*, rather than implying that it's impossible in principle, I'd be happy to agree with you. – dimo414 Oct 27 '16 at 18:19
  • @dimo414 You're right, I should adjust my answer to include that information. But I did mention this in my second comment reply to OP. – sytech Oct 27 '16 at 18:22
  • Also, your claim that "*There is always 60 seconds in a minute*" is [incorrect](https://en.wikipedia.org/wiki/Leap_second) - if this library assumes otherwise that's a problem. It's difficult to make *any* assumptions about how dates and times interact, as they're likely [all wrong](http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time). – dimo414 Oct 27 '16 at 18:22
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/126856/discussion-between-sytech-and-dimo414). – sytech Oct 27 '16 at 18:24
  • @csiz I've made some edits to my answer. Maybe this is more helpful to you now. – sytech Oct 27 '16 at 19:24
Maybe the simplest solution is to use string operations.

import numpy as np

def add_years(date, n_years):
    y, m, d = str(date).split("-")
    return np.datetime64("{}-{}-{}".format(int(y)+n_years, m, d))

my_date = np.datetime64("2015-06-01")
new_date = add_years(my_date, n_years=1)
print(new_date)
Akavall
  • using `d = np.datetime64("2016-02-29")` then `nd = add_years(d, n_years=1)` results in `ValueError: Day out of range in datetime string "2017-02-29"` – sytech Oct 27 '16 at 18:10
  • this might work if the string approach is used to make an ordinary `datetime` object representing the same start date, then adding to a `relativedelta` object, then creating a new `numpy.datetime64` object based on the resulting `datetime` object. But the overhead of doing that may subvert the purpose of using numpy to begin with. – sytech Oct 27 '16 at 19:16
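On the overhead concern in the last comment: for the date-tick use case in the question, where inexact results are acceptable, one possible sketch stays inside NumPy by downcasting to month precision, shifting by whole months, and then re-adding the day-of-month offset. The name `add_years_np` is hypothetical, and the policy is an assumption: an invalid date such as Feb 29 rolls over into March rather than being clamped.

```python
import numpy as np

def add_years_np(dates, n_years):
    """Add whole years to datetime64[D] values (scalar or array).

    Assumed policy: a day that does not exist in the target month
    rolls over into the next month (2016-02-29 + 1Y -> 2017-03-01).
    """
    dates = np.asarray(dates, dtype="datetime64[D]")
    months = dates.astype("datetime64[M]")                # drop the day
    day_offset = dates - months.astype("datetime64[D]")   # days into month
    shifted = months + np.timedelta64(12 * n_years, "M")  # month arithmetic
    return shifted.astype("datetime64[D]") + day_offset

ticks = np.array(["2015-06-01", "2016-02-29"], dtype="datetime64[D]")
shifted_ticks = add_years_np(ticks, 1)
```

Because everything is array arithmetic, this avoids the per-element round trip through `datetime`, at the cost of the rollover behavior described above.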