
I want to compare the timezone of a pd.DatetimeIndex with a pytz.timezone to see if the DatetimeIndex has the expected timezone. But the comparison fails, possibly because using the tzinfo argument does not work as explained in this answer.

import pandas as pd
import pytz
import unittest

tzstr = 'Europe/Vienna'
manual_tz = pytz.timezone(tzstr)

timestamp = pd.to_datetime('2020-03-02 07:00:00+01:00').tz_convert(tzstr)

tc = unittest.TestCase()
tc.assertEqual(timestamp.tz, manual_tz)
AssertionError: <DstTzInfo 'Europe/Vienna' CET+1:00:00 STD> != <DstTzInfo 'Europe/Vienna' LMT+1:05:00 STD>

How can I check that the timezone of timestamp is as expected? It should be the same as pytz.timezone in my opinion, but the behavior of pytz and/or pandas makes them somehow different.

Alternative formulation, in case a comparison like this is not possible: what do I have to look out for so that I don't run into this problem again? It has already happened to me more than once. I found a similar question, but having to compare offsets on multiple dates to check that they always match does not seem like the best way to do it.

Simon
  • I am finally using `.utcoffset()` on concrete dates. And in general try to avoid non-UTC tz in my code, only add tz when going to UI / plots. – Simon May 04 '22 at 08:39

1 Answer


I think your issue is related to this answer:

The default zone name and offset delivered when pytz creates a timezone object are the earliest ones available for that zone, and sometimes they can seem kind of strange.
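A minimal sketch of what that quote means in practice, assuming pytz is installed. The names `naive`, `attached` and `localized` are only illustrative. Passing the zone object directly as `tzinfo` keeps the zone's earliest entry (LMT for Europe/Vienna, as in the error message), whereas `localize()` picks the entry that is valid for the given date:

```python
import datetime

import pytz

tz = pytz.timezone("Europe/Vienna")
naive = datetime.datetime(2020, 3, 2, 7, 0)

# Attaching the zone directly keeps pytz's default (earliest) entry, here LMT.
attached = naive.replace(tzinfo=tz)

# localize() selects the entry that applies on this date (CET in March 2020).
localized = tz.localize(naive)

print(repr(attached.tzinfo))
print(repr(localized.tzinfo))
print(attached.utcoffset(), localized.utcoffset())
```

This is why `timestamp.tz` (already resolved for a concrete date) and a freshly created `pytz.timezone(...)` object do not compare equal.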

This is what I have when I run your code:

AssertionError: <DstTzInfo 'Europe/Vienna' CET+1:00:00 STD> != <DstTzInfo 'Europe/Vienna' LMT+1:05:00 STD>

What you can do is this:

import pandas as pd
import pytz
import unittest
import datetime

tzstr = 'Europe/Vienna'
manual_tz = pytz.timezone(tzstr)
pytz_localize = manual_tz.localize(datetime.datetime(2020, 3, 2, 7, 0, 0, 0))


timestamp = pd.to_datetime('2020-03-02 07:00:00+01:00').tz_convert(tzstr)

tc = unittest.TestCase()
tc.assertEqual(timestamp.tz, pytz_localize.tzinfo)

But IMHO, it's weird to have to create two dates just to compare the timezone information.
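One possible alternative that avoids building a second date is to compare the zone names rather than the tzinfo objects. This is only a sketch: it assumes that `str()` of the timezone object returns the IANA name, which holds for pytz and `zoneinfo` timezones but is not something pandas guarantees for every tz type. The UTC offset at the concrete timestamp can be checked separately with `.utcoffset()`:

```python
import datetime

import pandas as pd
import pytz

tzstr = "Europe/Vienna"
manual_tz = pytz.timezone(tzstr)

timestamp = pd.to_datetime("2020-03-02 07:00:00+01:00").tz_convert(tzstr)

# Compare by zone name instead of by tzinfo object identity/equality.
assert str(timestamp.tz) == str(manual_tz)

# The offset is only meaningful for a concrete date; Vienna is on CET in early March.
assert timestamp.utcoffset() == datetime.timedelta(hours=1)
```

Comparing names tells you the right zone was requested; comparing offsets on concrete dates tells you the conversion actually produced the expected local time.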

What do you mean by compare the timezone? What do you want to check?

Edit
About testing timezones and conversions: I would pick concrete test cases that matter to you. One thing I would check is that the DST change is handled correctly, with something like this:

import pandas as pd

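# Timestamps just before and just after the DST start in Austria in 2021.
all_dates = ['2021-03-28 01:59:00+0100', '2021-03-28 03:01:00+0200']
# The same instants in UTC. The strings already carry an offset, so the result
# is tz-aware and tz_localize('UTC') would raise; utc=True is used instead.
utc_dates = pd.to_datetime(['2021-03-28 00:59:00+00:00', '2021-03-28 01:01:00+00:00'], utc=True)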

timezoned_timestamps = pd.to_datetime(all_dates, utc=True).tz_convert("Europe/Vienna")

# then we check if it's equal after UTC conversion
pd.testing.assert_index_equal(utc_dates, timezoned_timestamps.tz_convert('UTC'))

The parameter utc=True is mandatory in the to_datetime call that parses all_dates, because those strings carry mixed offsets:

However, timezone-aware inputs with mixed time offsets (for example issued from a timezone with daylight savings, such as Europe/Paris) are not successfully converted to a DatetimeIndex. Instead a simple Index containing datetime.datetime objects is returned

From the documentation.
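As a small sketch of the point made in that quote (variable names are illustrative): with `utc=True`, the mixed-offset strings are parsed into a single UTC `DatetimeIndex`, and converting that to the user's zone afterwards yields the correct offset on each side of the DST change:

```python
import datetime

import pandas as pd

# Same instants as above: one before and one after the 2021 DST start in Austria.
mixed_offsets = ['2021-03-28 01:59:00+0100', '2021-03-28 03:01:00+0200']

# utc=True normalizes everything to UTC, so a proper DatetimeIndex is returned.
parsed = pd.to_datetime(mixed_offsets, utc=True)

# Converting to the target zone restores the local wall-clock times.
vienna = parsed.tz_convert("Europe/Vienna")

# Offsets differ across the DST boundary: CET before, CEST after.
offsets = [ts.utcoffset() for ts in vienna]
print(offsets)
```

Checking the per-timestamp offsets like this is one way to unit test that DST is handled as expected, without comparing tzinfo objects at all.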

ndclt
  • I am getting dates from a database, which are sent only with an offset. I process the timestamps and convert to a timezone defined by the user, so that time shifts (DST) are handled correctly. As I had problems with the timezone conversion multiple times already, I want to unittest, if the processed timestamp has the expected / user defined timezone in all cases. – Simon Mar 16 '22 at 13:29
  • I hope it's okay to copy the output to the question for clarity. – Simon Mar 16 '22 at 13:36
  • Thanks for explaining what you need. I will edit my answer with some details about testing this point. – ndclt Mar 16 '22 at 13:40
  • *it's weird to have to create two dates for comparing the timezone information* - I think it's perfectly reasonable you have to set a date here: time zone rules change over time, so UTC offsets depend on the date. – FObersteiner Mar 16 '22 at 21:50
  • regarding your edit / example: I don't think it makes sense to set a time zone here in the first place. If the input just has UTC offsets, just parse everything to datetime with `utc=True` and check if the date/times are equal. – FObersteiner Mar 16 '22 at 21:53