0

I have a column of timestamps with timezone info. It is imported with a SQL query from a database, where it has datatype datetimeoffset(7):

2020-06-04 13:00:00.0000000 +02:00

How do I convert this to a naive datetime that takes the tz info into account, e.g.

2020-06-04 15:00:00

Note that the + 02:00 part has been added to the timestamp, not simply stripped away.

The timestamps are stored in a pandas dataframe.

Mike
  • Note: technically speaking, the first time takes the time zone into account while the second does not; it is just a "wall clock" time and nobody knows its timezone. You are also applying the offset incorrectly: you should *subtract* the timezone offset to get UTC (UTC time + offset = local time). pandas should be able to handle timezones (internally, times are stored in the same UTC format, and an additional field holds the +2 for when you ask pandas to print times). – Giacomo Catenazzi Aug 07 '20 at 08:14
  • I assume 'timestamps' is a column in a pd.DataFrame (i.e. a pd.Series)? What dtype is it? I assume object (string); is it exactly this format `'07-08-2020 08:00:00 + 02:00'`? – FObersteiner Aug 07 '20 at 09:24
  • Yes, 'TimeStamp' is a column / series in a pandas df. The table is imported with an SQL query from a database in which it has datatype datetimeoffset(7). – Mike Aug 07 '20 at 10:46
  • ok, it seems you can omit the line `df['timestamp'] = df['timestamp'].str.replace(r'(\+|\-)\ ', r'\1')` from my answer ;-) – FObersteiner Aug 07 '20 at 13:00

2 Answers

1

If I understand the question correctly, you want naive local time, based on the time zone setting of the OS you run the script on:

import pandas as pd
from tzlocal import get_localzone

# example data...
df = pd.DataFrame({'timestamp': ["07-08-2020 08:00:00 + 02:00"]})

# cast to datetime, in case you haven't already done this
# we need to strip a space first...
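# (regex=True is needed in pandas >= 2.0, where str.replace matches literally by default)
df['timestamp'] = df['timestamp'].str.replace(r'(\+|\-)\ ', r'\1', regex=True)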
df['timestamp'] = pd.to_datetime(df['timestamp'])

# df['timestamp']
# 0   2020-07-08 08:00:00+02:00
# Name: timestamp, dtype: datetime64[ns, pytz.FixedOffset(120)]

# now we can convert to local timezone, which will give us aware local time
df['localtime'] = df['timestamp'].dt.tz_convert(get_localzone())

# ...and remove the tzinfo to get naive datetime:
df['localtime'] = df['localtime'].dt.tz_localize(None)
    
# note that my machine is on UTC+2 -->
# df['localtime']
# 0   2020-07-08 08:00:00
# Name: localtime, dtype: datetime64[ns]

...but keep in mind that removing the tzinfo changes the internally stored values: the naive column holds local wall-clock times, no longer UTC instants.
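If the result should not depend on the machine's time zone, the same two steps can be run against an explicitly named zone. The following is only a minimal sketch: the zone name "America/New_York" and the simplified input string are assumptions for illustration, not something taken from the question.

```python
import pandas as pd

# Example data; a single offset-aware timestamp (13:00 at UTC+2 = 11:00 UTC).
df = pd.DataFrame({"timestamp": pd.to_datetime(["2020-06-04 13:00:00+02:00"])})

# Assumed customer time zone (hypothetical, pick whatever applies).
target_zone = "America/New_York"

# Step 1: convert to the target zone; the column stays timezone-aware.
aware_local = df["timestamp"].dt.tz_convert(target_zone)

# Step 2: drop the tzinfo to get naive wall-clock time in that zone.
df["localtime"] = aware_local.dt.tz_localize(None)

print(df["localtime"])
```

New York is on UTC-4 in June, so the naive result here should be 2020-06-04 07:00:00.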

FObersteiner
0

First get the UTC offset, for example with this pandas function (you can also use the pytz module to get the UTC offset):

pandas.Timestamp.utcoffset

Once you have extracted the UTC offset in hours from the datetime, simply add it using the pandas Timedelta function (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timedelta.html). For example:

df['time'] = df['time'].dt.tz_convert(None) + pd.Timedelta(2, unit='h')
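
To avoid hard-coding the 2, the offset can probably be read from each timestamp with `Timestamp.utcoffset()` and stored in its own column. This is a rough sketch under assumptions: the column names `offset`, `wall` and `shifted` are made up for illustration, and the input is a simplified version of the timestamp from the question.

```python
import pandas as pd

# Example data; one timezone-aware timestamp at UTC+2.
df = pd.DataFrame({"time": pd.to_datetime(["2020-06-04 13:00:00+02:00"])})

# Per-row UTC offset, taken from each Timestamp and stored as a timedelta.
df["offset"] = pd.to_timedelta(df["time"].apply(lambda ts: ts.utcoffset()))

# UTC as naive datetime (11:00 for this example).
utc_naive = df["time"].dt.tz_convert(None)

# UTC + offset gives back the local wall-clock time.
df["wall"] = utc_naive + df["offset"]

# Adding the offset on top of the wall-clock time instead gives the
# value shown in the question (wall clock + 2 h).
df["shifted"] = df["time"].dt.tz_localize(None) + df["offset"]

print(df[["offset", "wall", "shifted"]])
```

For this input, `wall` should come out as 13:00 and `shifted` as 15:00; which of the two is wanted depends on how the offset is meant to be interpreted.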
user96564
  • For clarification, I am trying to automate conversion of UTC to customer local time, so manually calculating offsets like with + pd.Timedelta(2, unit='h') is not an option in this case. I am looking for a function that converts timezone aware timestamps to timestamps with the timezone information integrated into the time part. – Mike Aug 07 '20 at 08:35
  • Why not? Calculate the UTC offset using the pandas function, save it in, for example, a column called offset, and then simply add it using the Timedelta function as above. There are many ways this can be done. – user96564 Aug 07 '20 at 08:41
  • Okay, but how do I extract the "2" part from my timestamp 07-08-2020 08:00:00 + 02:00? I assumed that in python pandas or pytz there would be a time-aware function that did the conversion automatically. – Mike Aug 07 '20 at 08:45