1

I've parsed a number of date times and have converted them into different time zones. However, when trying to write the data frame to an excel file with df.to_excel I receive the following error, "Timestamp subtraction must have the same timezones or no timezones".

Looking at different threads I've tried using tz_localize(None) and tz_convert to convert the date times to UTC and then write the file but that hasn't worked.

#C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\Scripts>
import pandas as pd
from pandas import Timestamp
import pytz
from pytz import all_timezones
import datetime
import xlrd
import xlwt

data = pd.read_excel('lab.xls')
data = data.drop_duplicates('Site UP')
data = data.drop(data[data.Duration == 0].index)
data.to_excel('no_duplicates_no_zeroes.xls')
data['Site DOWN'] = pd.to_datetime(data['Site DOWN'])
data['Site UP'] = pd.to_datetime(data['Site UP'])

def conversion_function(x: pd.Series) -> pd.Timestamp:  
    zones = {'Atlantic': 'Canada/Atlantic',
             'Central': 'US/Central',
             'Eastern': "US/Eastern",
             'Mountain': 'US/Mountain',
             'Pacific': 'US/Pacific'}
    raw_time = pd.Timestamp(x[1])
    loc_raw_time = raw_time.tz_localize("US/Pacific")
    return loc_raw_time.tz_convert(zones[x[0]])

data['Adjusted_Down'] = data[['Time_Zone', 'Site DOWN']].apply(conversion_function, axis=1)
data['Adjusted_Up'] = data[['Time_Zone', 'Site UP']].apply(conversion_function, axis=1)

data.to_excel('no_duplicates_no_zeroes.xls')

Here's a dataframe example (before the dates are converted)

Region  Time_Zone   Site DOWN       Site UP         Duration
US      Mountain    07/22/19 01:19  07/22/19 01:23  4
Canada  Central     07/22/19 01:19  07/22/19 01:23  4
US      Eastern     07/21/19 22:50  07/21/19 22:52  2
US      Eastern     07/16/19 11:26  07/16/19 11:29  3
US      Pacific     07/25/19 16:47  07/25/19 16:50  3
US      Pacific     07/22/19 13:13  07/22/19 13:15  2
US      Pacific     07/22/19 01:20  07/22/19 01:22  2
US      Central     06/30/19 11:56  06/30/19 11:58  2
US      Mountain    07/29/19 03:14  07/29/19 03:16  2
US      Mountain    07/22/19 01:19  07/22/19 01:23  4
Canada  Atlantic    07/22/19 01:19  07/22/19 01:21  2
Canada  Eastern     07/03/19 06:51  07/03/19 11:34  283
US      Eastern     07/21/19 16:51  07/21/19 16:53  2
US      Eastern     07/21/19 13:43  07/21/19 16:31  168
Canada  Atlantic    07/22/19 01:19  07/22/19 01:21  2
US      Mountain    07/18/19 01:30  07/18/19 01:58  28
US      Central     07/22/19 01:20  07/22/19 01:22  2
Canada  Central     07/17/19 22:17  07/17/19 22:21  4
Canada  Central     07/15/19 06:14  07/15/19 08:42  148
Canada  Mountain    07/22/19 01:19  07/22/19 01:23  4
Canada  Mountain    07/22/19 01:18  07/22/19 01:21  3
Canada  Central     07/22/19 01:20  07/22/19 01:22  2
Canada  Central     07/17/19 09:26  07/17/19 09:28  2
Canada  Atlantic    07/30/19 18:18  07/31/19 04:44  626
Canada  Atlantic    07/29/19 21:20  07/29/19 21:22  2
Canada  Atlantic    07/25/19 03:41  07/25/19 03:43  2
Canada  Atlantic    07/22/19 01:20  07/22/19 01:23  3
Canada  Atlantic    07/21/19 22:50  07/21/19 22:50  0
Canada  Eastern     07/24/19 01:57  07/24/19 03:55  118

Consequently, here's the error message when trying to save the data frame to an excel sheet.

Traceback (most recent call last):
  File "<pyshell#15>", line 1, in <module>
    data.to_excel('final.xls', 'a+')
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 2127, in to_excel
    engine=engine)
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\formats\excel.py", line 662, in write
    freeze_panes=freeze_panes)
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 1720, in write_cells
    val, style)
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\xlwt\Worksheet.py", line 1088, in write
    self.row(r).write(c, label, style)
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\xlwt\Row.py", line 244, in write
    date_number = self.__excel_date_dt(label)
  File "C:\Users\cgarrido\AppData\Local\Programs\Python\Python37\lib\site-packages\xlwt\Row.py", line 99, in __excel_date_dt
    delta = date - epoch
  File "pandas\_libs\tslibs\timestamps.pyx", line 413, in pandas._libs.tslibs.timestamps._Timestamp.__sub__
TypeError: Timestamp subtraction must have the same timezones or no timezones
cgarrido
  • 77
  • 5
  • 1
    I do not have an immediate answer, but it would help to allow reproducing the error - for that one needs a portion of `data` dataframe - you can provide the code to creat it, or for example `data.to_dict()` result. This would be similar to output you provided as a printout, but will allow to recreate the error and tackle it. – Evgeny Aug 07 '19 at 21:21

1 Answers1

1

Try to switch those arguments for tz_localize and tz_convert - you should first get a timezone-aware time presented with the actual timezone and then convert it to "US/Pacific":

raw_time = pd.Timestamp(x[1])
loc_raw_time = raw_time.tz_localize(zones[x[0]])
return loc_raw_time.tz_convert("US/Pacific").replace(tzinfo=None)
ipaleka
  • 3,745
  • 2
  • 13
  • 33
  • 1
    Need to know from @cgarrido what the original time refers to - all times US/Pacific or all times recorded at local time zone? – Evgeny Aug 07 '19 at 21:28
  • Thanks for the reply, gave it a go but still get the same "Timestamp subtraction must have the same timezones or no timezones" error. Each site's original time is recorded in US/Pacific and gets converted to local time. – cgarrido Aug 07 '19 at 21:32
  • 1
    I googled and found out that Excel doesn't accept datetime values with timezone included, try with my edited answer. – ipaleka Aug 07 '19 at 21:45