
How would I efficiently convert local times in a dataframe to UTC times? There are 3 columns with information: the date (string), the timezone code (string), and the hour of the day (integer).

| date | timezone | hour |
|---|---|---|
| 7/31/2010 0:00:00 | EST | 1 |
| 6/14/2010 0:00:00 | PST | 3 |
| 6/14/2010 0:00:00 | PST | 4 |
| 5/30/2010 0:00:00 | EDT | 23 |
| 5/30/2010 0:00:00 | EDT | 24 |

After the data is converted I will be aggregating it to monthly data.

  • Does this answer your question? [How to specify time zone information when reading a csv with Pandas](https://stackoverflow.com/questions/68941786/how-to-specify-time-zone-information-when-reading-a-csv-with-pandas) – FObersteiner Nov 23 '22 at 05:45

1 Answer


G'day.

Working with dates is described reasonably well in this answer: converting utc to est time in python

In that case the timezone offsets are given as numbers, e.g. +11:00, whereas you have the US abbreviations. So you could convert that column to its numerical equivalent first and then use that approach.
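For completeness, a minimal sketch of that fixed-offset route could look like the following (the offset values and the parsing format string are assumptions based on your sample; note that fixed offsets ignore daylight saving transitions):

```python
import pandas as pd

# sample data shaped like the question
df = pd.DataFrame({
    "date": ["7/31/2010 0:00:00", "6/14/2010 0:00:00", "5/30/2010 0:00:00"],
    "timezone": ["EST", "PST", "EDT"],
    "hour": [1, 3, 23],
})

# assumed fixed UTC offsets (in hours) for the abbreviations in the sample
offsets = {"EST": -5, "EDT": -4, "PST": -8, "PDT": -7}

# local wall-clock time = parsed date + hour; UTC = local time minus the offset
local = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S") + pd.to_timedelta(df["hour"], unit="h")
df["utc"] = (local - pd.to_timedelta(df["timezone"].map(offsets), unit="h")).dt.tz_localize("UTC")
```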

Personally I find the notation "Australia/Melbourne" (an IANA time zone name) far easier to deal with, especially because it handles daylight saving time for you. Time zones are a nightmare. That's described here: Python: datetime tzinfo time zone names documentation

As for the hour column, you can simply join the date and the hour together to form a single date-and-time value; see the sketch below.
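Rather than joining strings, one option is to parse the date and add the hour as a timedelta. A small sketch (the format string is an assumption from your sample, and treating hour 24 as midnight of the next day is an assumption about your data):

```python
import pandas as pd

dates = pd.Series(["5/30/2010 0:00:00", "5/30/2010 0:00:00"])
hours = pd.Series([23, 24])

# parse the date part, then add the hour as a timedelta
local = pd.to_datetime(dates, format="%m/%d/%Y %H:%M:%S") + pd.to_timedelta(hours, unit="h")
print(local)
# 0   2010-05-30 23:00:00
# 1   2010-05-31 00:00:00   <- hour 24 rolls over to the next day
```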

So I'd suggest you convert that timezone column to that format (i.e. EST as America/New_York, etc.), then feed all three columns into a datetime conversion per the first answer.
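Putting it together, a sketch along those lines (the abbreviation-to-IANA mapping is an assumption, and it presumes each row's abbreviation agrees with what that zone actually uses on that date; `tz_localize` takes a single zone, hence the per-zone loop). The last lines show one possible monthly aggregation, since that's your stated end goal:

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["7/31/2010 0:00:00", "6/14/2010 0:00:00", "6/14/2010 0:00:00",
             "5/30/2010 0:00:00", "5/30/2010 0:00:00"],
    "timezone": ["EST", "PST", "PST", "EDT", "EDT"],
    "hour": [1, 3, 4, 23, 24],
})

# assumed mapping from US abbreviations to IANA zone names
tz_map = {"EST": "America/New_York", "EDT": "America/New_York",
          "PST": "America/Los_Angeles", "PDT": "America/Los_Angeles"}

# naive local timestamp = parsed date + hour as a timedelta
local = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S") + pd.to_timedelta(df["hour"], unit="h")

# tz_localize accepts one zone at a time, so localize per group, then convert to UTC
df["utc"] = pd.concat(
    local[df["timezone"] == abbr].dt.tz_localize(tz_map[abbr]).dt.tz_convert("UTC")
    for abbr in df["timezone"].unique()
).sort_index()

# example monthly aggregation: row count per UTC month
monthly = df.set_index("utc").resample("MS").size()
print(df)
print(monthly)
```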

  • +1 for suggesting [IANA names](https://en.m.wikipedia.org/wiki/List_of_tz_database_time_zones). Abbreviated tz names are one thing that can make tz handling painful. They're ambiguous. Not something you want to teach a computer. – FObersteiner Nov 23 '22 at 06:02