I'm trying to convert np.datetime64 to int via int(np.datetime64(...)). Surprisingly, sometimes it works and sometimes it doesn't, depending on how the datetime was created:

import numpy as np

a = np.datetime64('2017-09-26T15:20:11.546205184')
int(a)  # works
a = np.datetime64('2017-09-26')
int(a)  # raises TypeError

results in:

1506439211546205184
TypeError: int() argument must be a string, a bytes-like object or a number, not 'datetime.date'

Is there a difference in how NumPy stores these dates internally that causes the error when converting to int?

zlenyk
  • this error actually is not specific to dates; if your date/time string specifies any ***precision less than nanoseconds***, you must ***also specify a [unit](https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units)*** for `int(...)` to work. I think this behavior ensures that the conversion to a serial date/time is unambiguous - if the unit of the input datetime is e.g. *seconds*, the result is *seconds* since the Unix epoch and so on. – FObersteiner Feb 18 '22 at 14:45
  • *However,* using `.astype(np.int64)` will work in any case, and return Unix time in the precision you specified. E.g. `np.datetime64('2017-09-26T15:20:11.546').astype(np.int64)` will give you milliseconds since the epoch etc. – FObersteiner Feb 18 '22 at 14:53
  • Related: [How to get unix timestamp from numpy.datetime64](https://stackoverflow.com/q/11865458/10197418) – FObersteiner Mar 02 '22 at 10:07

2 Answers

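The difference is the unit NumPy infers from the string: a date-only string gets day precision ('D'), while a string with time values down to nanosecond digits gets nanosecond precision ('ns').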

When you convert an np.datetime64 to an integer (int or np.int64), the value is the epoch time: the number of that value's units (days, seconds, nanoseconds, ...) elapsed since 1970-01-01 00:00:00 (UTC).

(See epoch time calculator: https://www.epochconverter.com/)

However, "2017-09-26" carries no hour, minute, second, or time zone information, so NumPy stores it with day precision. int() is then handed a datetime.date, as the error message shows, and it cannot turn that into a number.
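The `datetime.date` in the error message hints at what happens: judging by that message, `int()` appears to go through the same conversion to a Python object that `.item()` performs. A minimal sketch to check which unit NumPy inferred and which Python type each value turns into (the variable names are only for illustration):

```python
import datetime
import numpy as np

day = np.datetime64('2017-09-26')                    # date only
ns = np.datetime64('2017-09-26T15:20:11.546205184')  # nanosecond digits

# The dtype shows the unit NumPy inferred from the string.
print(day.dtype, type(day.item()))  # day unit, converts to datetime.date
print(ns.dtype, type(ns.item()))    # nanosecond unit, converts to a plain int
```

Python's datetime types cannot hold nanoseconds, which is presumably why the nanosecond value comes back as a plain integer and `int()` accepts it.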

To make it convertible, either spell out the time down to nanoseconds or pass an explicit unit and use astype(np.int64), as follows:

a = np.datetime64('2017-09-26T00:00:00.000000000')
print(int(a)) # 1506384000000000000 --> This is the epoch time in nanoseconds for 2017-09-26 00:00:00

a = np.datetime64('2017-09-26','us').astype(np.int64) # not int, use np.int64
print(a) # 1506384000000000 -> This is also an epoch time for 2017-09-26 00:00:00, in microseconds

In addition, use astype(np.int64) instead of astype(int) to get the exact epoch time from a datetime64 value. On platforms where int maps to a 32-bit integer (for example Windows with older NumPy versions), the 64-bit value does not fit and only its low 32 bits survive. (The number of days from 1970-01-01 is what you get from a date-only value such as np.datetime64('2017-09-26').astype(np.int64).)

a = np.datetime64('2017-09-26T15:20:11.546205184').astype(int)
print(a) # 1072585728 where int is 32-bit -> not an epoch time, only the low 32 bits of it

a = np.datetime64('2017-09-26T15:20:11.546205184').astype(np.int64)
print(a) # 1506439211546205184 -> the correct epoch time of 2017-09-26 15:20:11.546205184 in nanoseconds
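To see where 1072585728 comes from regardless of platform, the sketch below forces the 32-bit cast explicitly with np.int32. It assumes the original output was produced where int is 32 bits wide; the variable names are only for illustration.

```python
import numpy as np

# Full 64-bit epoch time in nanoseconds.
ns = np.datetime64('2017-09-26T15:20:11.546205184').astype(np.int64)

# Casting to 32 bits keeps only the low 32 bits of the value.
low32 = ns.astype(np.int32)

# A date-only value has day precision, so this is days since 1970-01-01.
days = np.datetime64('2017-09-26').astype(np.int64)

print(ns)     # 1506439211546205184
print(low32)  # 1072585728
print(days)   # 17435
```

The truncated number matches 1506439211546205184 modulo 2**32, while the day count is a much smaller number.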
  • edited with consideration of @FObersteiner 's comment, Thanks!
Park
  • "*it is impossible to calculate*" - well it is possible, but you have to make assumptions about the time and the time zone (which a bare date does not have). Interestingly, `np.datetime64('2017-09-26').astype(np.int64)` gives you the number of days since the Unix epoch – FObersteiner Feb 18 '22 at 14:28
  • @FObersteiner Oh, thank you for the comment. I edited some of the wording as you suggested. I did not know that it was the number of days. I thought the information was lost by converting `int64` to `int32`. Thank you again :) – Park Feb 18 '22 at 14:36
  • you can check with `np.datetime64('1970-01-01T00:00') + np.timedelta64(np.datetime64('2017-09-26').astype(np.int64), 'D')` - and I think it is safe to assume that the unit of a date must be 'days', also considering that numpy's datetime assumes UTC by default and doesn't do weird local time stuff like native Python datetime. – FObersteiner Feb 18 '22 at 14:42
  • @FObersteiner Nice tips! – Park Feb 18 '22 at 14:46

Try:

a = np.datetime64('2017-09-26','us').astype(np.int64)
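The same pattern works for other units. As a rough sketch, it could be wrapped in a small helper; `to_epoch` is a hypothetical name used here for illustration, not a NumPy function:

```python
import numpy as np

def to_epoch(value, unit='ns'):
    """Hypothetical helper: parse a date/time string with an explicit
    unit and return the count of that unit since 1970-01-01 (UTC)."""
    return int(np.datetime64(value, unit).astype(np.int64))

print(to_epoch('2017-09-26', 'us'))  # 1506384000000000
print(to_epoch('2017-09-26', 's'))   # 1506384000
print(to_epoch('2017-09-26', 'D'))   # 17435
```

Passing the unit explicitly keeps the meaning of the returned integer unambiguous, which is the point made in the comments on the question.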