I have a DataFrame with a timezone-aware index:
>>> dfn.index
Out[1]:
DatetimeIndex(['2004-01-02 01:00:00+11:00', '2004-01-02 02:00:00+11:00',
'2004-01-02 03:00:00+11:00', '2004-01-02 04:00:00+11:00',
'2004-01-02 21:00:00+11:00', '2004-01-02 22:00:00+11:00'],
dtype='datetime64[ns]', freq='H', tz='Australia/Sydney')
I save it to CSV, then read it back as follows:
>>> dfn.to_csv('temp.csv')
>>> df = pd.read_csv('temp.csv', index_col=0, header=None)
>>> df.head()
Out[1]:
1
0
NaN 0.0000
2004-01-02 01:00:00+11:00 0.7519
2004-01-02 02:00:00+11:00 0.7520
2004-01-02 03:00:00+11:00 0.7515
2004-01-02 04:00:00+11:00 0.7502
The index is read back as strings:
>>> df.index[1]
Out[3]: '2004-01-02 01:00:00+11:00'
Converting it with pd.to_datetime changes the wall-clock time: the +11:00 offset is applied and the result comes back as a naive timestamp in UTC, 11 hours earlier:
>>> df.index = pd.to_datetime(df.index)
>>> df.index[1]
Out[6]: Timestamp('2004-01-01 14:00:00')
I could shift the index by 11 hours manually to fix it, but is there a better way to handle this?
I tried the solution from the answer here, but it slows the code down a lot.
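For reference, here is a minimal sketch of one way the round trip might be handled, assuming the offset strings in the CSV can be parsed as UTC and then converted back to the original zone. The small frame and the column name `value` are hypothetical stand-ins for dfn, and an in-memory buffer is used in place of temp.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for dfn: a tz-aware index in Australia/Sydney.
idx = pd.to_datetime([
    '2004-01-02 01:00:00', '2004-01-02 02:00:00',
    '2004-01-02 03:00:00', '2004-01-02 04:00:00',
]).tz_localize('Australia/Sydney')
dfn = pd.DataFrame({'value': [0.7519, 0.7520, 0.7515, 0.7502]}, index=idx)

# Round trip through CSV (in-memory buffer instead of temp.csv).
buf = io.StringIO()
dfn.to_csv(buf)
buf.seek(0)
df = pd.read_csv(buf, index_col=0)

# The index is now plain strings such as '2004-01-02 01:00:00+11:00'.
# Parse them as UTC instants, then convert back to the original zone
# so the wall-clock times match what was saved.
df.index = pd.to_datetime(df.index, utc=True).tz_convert('Australia/Sydney')

print(df.index)
```

Converting with tz_convert rather than shifting by a fixed 11 hours should also cope with dates where Sydney is at +10:00, since the offset is looked up per timestamp. Whether this is fast enough for a large file is something I have not measured.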