3

With the following code, I get a two hour difference after converting back to np.datetime64.

How can I avoid this? (In case it matters: I am currently in Central Europe.)

import pandas as pd
import numpy as np
import datetime

a = np.datetime64('2018-04-01T15:30:00').astype("float")
a
b = np.datetime64(datetime.datetime.fromtimestamp(a))
b

Out[18]: numpy.datetime64('2018-04-01T17:30:00.000000')
user7468395

3 Answers

7

The problem is not in the np.datetime64 conversion, but in datetime.datetime.fromtimestamp.

Since NumPy 1.11, np.datetime64 is timezone naive. It no longer assumes that input is in local time, nor does it print local times.

However, datetime.datetime.fromtimestamp does assume local time. From the docs:

Return the local date and time corresponding to the POSIX timestamp, such as is returned by time.time(). If optional argument tz is None or not specified, the timestamp is converted to the platform’s local date and time, and the returned datetime object is naive.

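You can use datetime.datetime.utcfromtimestamp instead (note that it is deprecated as of Python 3.12 in favour of datetime.datetime.fromtimestamp(a, tz=datetime.timezone.utc)):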

>>> a = np.datetime64('2018-04-01T15:30:00').astype("float")
>>> np.datetime64(datetime.datetime.utcfromtimestamp(a))
numpy.datetime64('2018-04-01T15:30:00.000000')
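
For newer Python versions, a minimal sketch of the same round trip with a timezone-aware conversion might look like this. The variable names are only illustrative, and dropping the tzinfo before handing the value to NumPy is a choice made here so that np.datetime64 receives a naive UTC datetime:

```python
import datetime

import numpy as np

original = np.datetime64('2018-04-01T15:30:00')
a = original.astype("float")  # seconds since 1970-01-01T00:00:00

# Interpret the timestamp explicitly as UTC instead of the machine's local zone.
utc_dt = datetime.datetime.fromtimestamp(a, tz=datetime.timezone.utc)

# Drop the tzinfo so that NumPy receives a naive datetime holding UTC wall time.
b = np.datetime64(utc_dt.replace(tzinfo=None))

print(b)
```

Because the time zone is passed explicitly, the result should not depend on where the code runs.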
Robbe
1

https://github.com/numpy/numpy/issues/3290

As of 1.7, datetime64 attempts to handle timezones by:

  • Assuming all datetime64 objects are in UTC
  • Applying timezone offsets when parsing ISO 8601 strings
  • Applying the Locale timezone offset when the ISO string does not specify a TZ.
  • Applying the Locale timezone offset when printing, etc.

https://stackoverflow.com/a/18817656/7583612

classmethod datetime.fromtimestamp(timestamp, tz=None)

Return the local date and time corresponding to the POSIX timestamp, such as is returned by time.time(). If optional argument tz is None or not specified, the timestamp is converted to the platform’s local date and time, and the returned datetime object is naive.

Else tz must be an instance of a class tzinfo subclass, and the timestamp is converted to tz’s time zone. In this case the result is equivalent to tz.fromutc(datetime.utcfromtimestamp(timestamp).replace(tzinfo=tz))
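
To illustrate the tz argument described in the quoted documentation, here is a small sketch using only the standard library. The timestamp is the float value from the question, and the fixed UTC+2 offset named "CEST" is an assumption standing in for Central European summer time:

```python
import datetime

ts = 1522596600.0  # float value of np.datetime64('2018-04-01T15:30:00')

# Fixed-offset zone used as a stand-in for Central European summer time.
cest = datetime.timezone(datetime.timedelta(hours=2), "CEST")

# With tz given, the result is an aware datetime in that zone.
aware = datetime.datetime.fromtimestamp(ts, tz=cest)

# The equivalence from the docs, with the naive UTC value built from the epoch
# rather than via utcfromtimestamp.
utc_naive = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=ts)
equivalent = cest.fromutc(utc_naive.replace(tzinfo=cest))

print(aware)       # wall-clock time in the UTC+2 zone
print(equivalent)  # same instant, built the long way
```

The aware result reads 17:30 in the UTC+2 zone, which matches the two hour shift reported in the question.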

naivepredictor
1

Referring back to some of my notes, I found the following:

import numpy as np
dt64 = np.datetime64("2011-11-11 14:23:56")

# dt64 is internally just some sort of int
#      it has no fields, and very little support in numpy

import datetime, time
dtdt = dt64.astype(datetime.datetime)         # <<<<<<<< use this!
dtdt.year
dtdt.month
dtdt.day

# to convert back:
dt64 = np.datetime64(dtdt)                    # <<<<<<<< use this too!
dt64.item().strftime("%Y%b%d")

The modules datetime and time are normal Python standard-library modules: they work reasonably well and have lots of fields, conversions, and support.

datetime64 is a fairly bare-bones type built into numpy. It stores a 64-bit integer count of time units (seconds, microseconds, nanoseconds, ...) since 1970-01-01. datetime64 is something completely different from a datetime.datetime. Converting a datetime64 to a float and back can lose precision at fine resolutions such as nanoseconds, so it is best avoided, although at second resolution (as in the question) the float value is exact.
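
As a quick check of how the value is stored, the following sketch (assuming a datetime64 with second resolution, as in the question) looks at the underlying integer and rebuilds the datetime64 from it without going through datetime.datetime:

```python
import numpy as np

dt64 = np.datetime64("2018-04-01T15:30:00")
print(dt64.dtype)  # the unit is inferred from the string, here seconds

# The stored value is an integer count of units since 1970-01-01.
as_int = int(dt64.astype("int64"))
as_float = float(dt64.astype("float64"))

# Rebuild the datetime64 from the integer count and an explicit unit.
back = np.datetime64(int(as_float), "s")
print(as_int, back)
```

For this value the float round trip is exact, since a count of seconds of this size fits comfortably in a 64-bit float.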

The standard-library module datetime (not part of numpy) can also do things like:

# timedelta()
delta = datetime.timedelta(days=11, hours=10, minutes=9, seconds=8)

delta                   # datetime.timedelta(days=11, seconds=36548)
delta.days
delta.seconds
delta.microseconds
delta.total_seconds()   # 986948.0

# arithmetic: +-*/
#   2 timedelta's
#   timedelta and datetime
now = datetime.datetime.now()
christmas = datetime.datetime(2019,12,25)
delta = christmas - now

So let numpy store your dates as datetime64 where that is convenient, but I would recommend the standard-library module datetime for datetime arithmetic.