
Hi, I have some dates in datetime.datetime format that I use to filter a pandas DataFrame by pandas Timestamp. I just tried the following and got a 2-hour offset:

from datetime import datetime
import pandas as pd
pd.to_datetime(datetime(2020, 5, 11, 0, 0, 0).timestamp()*1e9)

The output is:

->Timestamp('2020-05-10 22:00:00')

Can anybody explain why this gives a 2-hour offset? I am in Denmark, so it corresponds to my local offset from GMT. Is this the reason? I can of course just add 2 hours, but I want to understand the cause so that the script stays robust in the future.

Thanks for your help, Jesper

FObersteiner
Mr. O

2 Answers


pd.to_datetime accepts a datetime object directly, so you could just do the following (pandas treats a naive datetime as UTC):

pd.to_datetime(datetime(2020, 5, 11))
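Since the goal is to filter a DataFrame, here is a minimal sketch of how the converted value might be used. The DataFrame, its "value" column, and the sample index are made up for illustration and are not from the question; it assumes the index holds naive timestamps on the same clock as the datetime.

```python
from datetime import datetime

import pandas as pd

# Hypothetical data: three rows indexed by naive timestamps.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.to_datetime(
        ["2020-05-10 22:00", "2020-05-11 00:00", "2020-05-11 02:00"]
    ),
)

# Convert the datetime directly; no detour through epoch numbers,
# so no local-time conversion is involved.
cutoff = pd.to_datetime(datetime(2020, 5, 11))

# Keep only rows at or after the cutoff.
filtered = df[df.index >= cutoff]
```

Because nothing here goes through .timestamp(), the cutoff stays at midnight regardless of the machine's time zone setting.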

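You are getting a 2-hour offset when converting to a timestamp because, by default, Python's datetime is unaware of time zones and gives you a "naive" datetime object (docs are here: https://docs.python.org/3/library/datetime.html#aware-and-naive-objects). Calling timestamp() on a naive datetime interprets it as local time and returns the corresponding seconds since the Unix epoch, which is counted in UTC. Midnight local time in Denmark (UTC+2 in summer) is 22:00 UTC on the previous day, and pandas reads the number back as UTC, hence the 2-hour offset.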
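To check this on your own machine, here is a small sketch using only the standard library. The variable names are mine, not from the question. It compares the timestamp of the naive datetime with that of the same wall-clock time marked as UTC; the difference should equal whatever UTC offset your OS applies to that date.

```python
from datetime import datetime, timezone

naive = datetime(2020, 5, 11)                   # no tzinfo: treated as local time
aware_utc = naive.replace(tzinfo=timezone.utc)  # same wall-clock time, marked as UTC

# UTC offset the OS assigns to the naive value (depends on the machine).
local_offset = naive.astimezone().utcoffset()

# Difference between the two epoch values, in seconds.
# e.g. 7200.0 on a machine set to CEST, 0.0 on a machine set to UTC.
diff_seconds = aware_utc.timestamp() - naive.timestamp()

print(local_offset, diff_seconds)
```

The aware value gives the same number on every machine, which is why attaching tzinfo (or skipping .timestamp() altogether) makes the script robust.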

You can pass in a tzinfo parameter to the datetime object specifying that the time should be treated as UTC:

from datetime import datetime
import pandas as pd
import pytz

pd.to_datetime(datetime(2020, 5, 11, 0, 0, 0, tzinfo=pytz.UTC).timestamp()*1e9)

Alternatively, you can generate a UTC timestamp using the calendar module:

from datetime import datetime
import pandas as pd
import calendar

timestamp = calendar.timegm(datetime(2020, 5, 11, 0, 0, 0).utctimetuple())
pd.to_datetime(timestamp*1e9)
Matti John

If your datetime objects actually represent local time (i.e. your OS time zone setting), you can simply use

from datetime import datetime
import pandas as pd

t = pd.to_datetime(datetime(2020, 5, 11).astimezone())
# e.g. I'm on CEST, so t is
# Timestamp('2020-05-11 00:00:00+0200', tz='Mitteleuropäische Sommerzeit')

see: How do I get a value of datetime.today() in Python that is “timezone aware”?
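If you would rather not depend on the OS setting, one option is to name the time zone explicitly. This is a sketch that assumes the dates are Danish local time; "Europe/Copenhagen" is my assumption based on the question, so substitute your own zone.

```python
from datetime import datetime

import pandas as pd

# Attach the zone explicitly instead of relying on the OS setting.
local = pd.Timestamp(datetime(2020, 5, 11)).tz_localize("Europe/Copenhagen")

# Convert to UTC for comparison with UTC-based data.
utc = local.tz_convert("UTC")

print(local)
print(utc)
```

This gives the same result on any machine, and it makes the 2-hour shift visible as a proper UTC offset instead of a silent change in the clock time.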


Just keep in mind that pandas will treat naive Python datetime objects as if they were UTC:

from datetime import timezone

t1 = pd.to_datetime(datetime(2020, 5, 11, tzinfo=timezone.utc))
t2 = pd.to_datetime(datetime(2020, 5, 11))

t1.timestamp() == t2.timestamp()
# True

see also: Python datetime and pandas give different timestamps for the same date

FObersteiner