6

I'm importing data into pandas and want to remove any timezones, if they're present in the data. If the data has a timezone, the following code works:

col = "my_date_column"
df[col] = pd.to_datetime(df[col]).dt.tz_localize(None) # We don't want timezones...

If the data does not contain a timezone, I'd like to use the following code:

df[col] = pd.to_datetime(df[col])

My issue is that I'm not sure how to test whether a datetime object or series has a timezone.

Asclepius
Yaakov Bressler
  • You can apply `.dt.tz_localize(None)` to all columns; if there is no timezone, it leaves the column unchanged. – V. Ayrat Jul 06 '20 at 04:19
  • Are you sure you want to localize to None? You could also convert to None, see [here](https://stackoverflow.com/a/62656878/10197418) – FObersteiner Jul 06 '20 at 06:03
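
The two comments above can be checked with a small sketch. The example values are made up for illustration; it assumes a column that already has a datetime64 dtype. `tz_localize(None)` drops the timezone and keeps the local wall time, while `tz_convert(None)` converts to UTC first and then drops the timezone.

```python
import pandas as pd

# Illustrative data (not from the question): one naive and one aware column
naive = pd.Series(pd.to_datetime(["2020-06-06 12:00"]))
aware = naive.dt.tz_localize("Europe/Berlin")  # UTC+2 in June

# On a column without a timezone, tz_localize(None) changes nothing
unchanged = naive.dt.tz_localize(None)

# On a tz-aware column the two methods give different naive results
local = aware.dt.tz_localize(None)  # keeps wall time: 2020-06-06 12:00:00
utc = aware.dt.tz_convert(None)     # shifts to UTC:   2020-06-06 10:00:00
```

Which of the two is appropriate depends on whether the naive values should represent local time or UTC.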

2 Answers

8

Assuming you have a column of datetime values, you can check the tzinfo of each timestamp in the column. It's basically described here (although this is not specific to pytz). Example:

import pandas as pd

# example series:
s = pd.Series([
        pd.Timestamp("2020-06-06").tz_localize("Europe/Berlin"), # tzinfo defined
        pd.Timestamp("2020-06-07") # tzinfo is None
        ])

# s
# 0    2020-06-06 00:00:00+02:00
# 1          2020-06-07 00:00:00
# dtype: object
  
# now find a mask which is True where the timestamp has a timezone:
has_tz = s.apply(lambda t: t.tzinfo is not None)

# has_tz
# 0     True
# 1    False
# dtype: bool
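
As a possible follow-up to the mask above, here is a minimal sketch that removes the timezone only from the elements that have one. It assumes every element is a `pd.Timestamp` and that keeping the local wall time is what you want; the variable name `naive` is just for illustration.

```python
import pandas as pd

# Same mixed example series as above (object dtype)
s = pd.Series([
    pd.Timestamp("2020-06-06").tz_localize("Europe/Berlin"),  # tz-aware
    pd.Timestamp("2020-06-07"),                               # tz-naive
])

# Strip the timezone element by element, leaving naive timestamps untouched
naive = s.apply(lambda t: t.tz_localize(None) if t.tzinfo is not None else t)

# Every element is naive now, so the column can be a regular datetime column
naive = pd.to_datetime(naive)
```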
FObersteiner
5

This builds upon the prior answer by FObersteiner.

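If the column has a datetime64 dtype (naive datetime64[ns] or tz-aware datetime64[ns, tz]), use Series.dt.tz. Here col is the Series itself, not the column name: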

col.dt.tz is None

If the column has dtype object and holds pd.Timestamp values, it doesn't support .dt, so use Timestamp.tz on each element instead:

col.apply(lambda t: t.tz is None).all()
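
Both checks could be combined into one function along these lines. This is only a sketch: `strip_timezone` is a hypothetical helper name, and the object-dtype branch assumes every element is a `pd.Timestamp`.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def strip_timezone(series: pd.Series) -> pd.Series:
    """Hypothetical helper: return the series as tz-naive datetimes."""
    if is_datetime64_any_dtype(series):
        # datetime64 dtype: the whole column shares one timezone (or none)
        if series.dt.tz is not None:
            return series.dt.tz_localize(None)
        return series
    # object dtype: check each Timestamp individually
    stripped = series.apply(
        lambda t: t.tz_localize(None) if t.tz is not None else t
    )
    return pd.to_datetime(stripped)


# Usage with illustrative data
aware = pd.Series(pd.to_datetime(["2020-06-06 12:00"]).tz_localize("Europe/Berlin"))
mixed = pd.Series([
    pd.Timestamp("2020-06-06").tz_localize("Europe/Berlin"),
    pd.Timestamp("2020-06-07"),
])
aware_result = strip_timezone(aware)
mixed_result = strip_timezone(mixed)
```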
FObersteiner
Asclepius