I'm trying to convert all instances of 'GMT' time in a time/date column ('Created_At') in a csv file so that it is all formatted in 'EST'. Please see below:

import pandas as pd
from pandas.tseries.resample import TimeGrouper
from pandas.tseries.offsets import DateOffset
from pandas.tseries.index import DatetimeIndex

cambridge = pd.read_csv('\Users\cgp\Desktop\Tweets.csv')
cambridge['Created_At'] = pd.to_datetime(pd.Series(cambridge['Created_At']))
cambridge.set_index('Created_At', drop=False, inplace=True)
cambridge.index = cambridge.index.tz_localize('GMT').tz_convert('EST')
cambridge.index = cambridge.index - DateOffset(hours = 12)

The error I'm getting is:

cambridge.index = cambridge.index.tz_localize('GMT').tz_convert('EST')

AttributeError: 'Index' object has no attribute 'tz_localize'

I've tried various things but am stumped as to why the Index object won't recognize the tz_localize attribute. Thank you so much for your help!

EdChum
cgp25
  • `tz_localize` is not a method available to `Index` types. Can you try performing the conversion before setting it as the index? – EdChum Mar 06 '15 at 16:42
  • Saying that, it is a method available to `DatetimeIndex` so this could be a bug? compare http://pandas.pydata.org/pandas-docs/stable/api.html#index with http://pandas.pydata.org/pandas-docs/stable/api.html#time-date-components – EdChum Mar 06 '15 at 16:44
  • thanks Ed. i'm new to python - if this is a bug, how would I proceed in fixing it? – cgp25 Mar 07 '15 at 00:15
  • Before trying to localize, check whether it's an Index or a DatetimeIndex; show us a three-line sample of the values you're starting with (preferably in a format we can use as an argument to DataFrame); and see if the simplified version that works for me works for you. – cphlewis Mar 07 '15 at 06:47
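The check suggested in the comments could look roughly like the sketch below. The sample values are hypothetical, since the asker's real data was not posted, and `index_kind` is a helper name made up for illustration. It assumes the root cause is that the column still holds unparsed strings when `set_index` is called.

```python
import pandas as pd

# Hypothetical sample values; the asker's real data was not posted.
good = pd.DataFrame({'Created_At': ['2015-03-06 09:30:56', '2015-03-06 10:15:00']})
bad = pd.DataFrame({'Created_At': ['{ "$date" : "2015-03-03T18:44:46.000-0500" }',
                                   'not a date']})


def index_kind(df):
    """Return the class name of the index produced by set_index('Created_At')."""
    return type(df.set_index('Created_At', drop=False).index).__name__


# A column of unparsed strings gives a plain Index, which has no tz_localize.
kind_bad = index_kind(bad)

# A column of parsed datetimes gives a DatetimeIndex, which can be localized.
good['Created_At'] = pd.to_datetime(good['Created_At'])
kind_good = index_kind(good)

# errors='coerce' turns unparseable values into NaT, so they are easy to count.
unparsed = pd.to_datetime(bad['Created_At'], errors='coerce').isna().sum()

print(kind_bad, kind_good, unparsed)
```

If `unparsed` is greater than zero, the strings probably need cleaning before any time zone handling.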

2 Answers

Replace

cambridge.set_index('Created_At', drop=False, inplace=True)

with

cambridge.set_index(pd.DatetimeIndex(cambridge['Created_At']), drop=False, inplace=True)
Mostafa Mahmoud
  • hi, thanks for your help. when I try replacing that line I get the following error: File "C:\Python27\lib\site-packages\pandas-0.15.2-py2.7-win-amd64.egg\pandas\tseries\tools.py", line 467, in parse_time_string raise DateParseError(e) pandas.tseries.tools.DateParseError: unknown string format – cgp25 Mar 07 '15 at 00:06
  • @cgp25 can you attach a sample of the data in `cambridge['Created_At']`? Does it look like this: `Fri Mar 06 09:30:56 +0000 2015`? – Mostafa Mahmoud Mar 07 '15 at 03:05
  • it doesn't, it looks like this: `"created_at" : { "$date" : "2015-03-03T18:44:46.000-0500" } "created_at" : { "$date" : "2015-03-03T18:44:48.000-0500" } "created_at" : { "$date" : "2015-03-03T18:44:54.000-0500" }` – cgp25 Mar 07 '15 at 03:45
  • Are those "created_at" strings in the data you're passing in? Or the dict entries? Because pandas probably didn't automatically convert them, you probably have to clean that up. – cphlewis Mar 07 '15 at 07:00
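One way to do the cleanup mentioned in the last comment is sketched below. It assumes each cell holds the whole `"created_at" : { "$date" : ... }` fragment as a string, which the thread never confirms; the `raw` values and the regular expression are illustrative. Because these timestamps already carry a `-0500` offset, the sketch parses them straight to UTC instead of calling `tz_localize('GMT')`.

```python
import pandas as pd

# Hypothetical raw values shaped like the sample quoted in the comment above.
raw = pd.Series([
    '"created_at" : { "$date" : "2015-03-03T18:44:46.000-0500" }',
    '"created_at" : { "$date" : "2015-03-03T18:44:48.000-0500" }',
    '"created_at" : { "$date" : "2015-03-03T18:44:54.000-0500" }',
])

# Extract only the ISO 8601 timestamp (assumed shape: date, 'T', time, millis, offset).
pattern = r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4})'
stamps = raw.str.extract(pattern, expand=False)

# The offset is part of each string, so parse to timezone-aware UTC values.
created = pd.to_datetime(stamps, format='%Y-%m-%dT%H:%M:%S.%f%z', utc=True)

# Convert the aware values to US Eastern time.
eastern = created.dt.tz_convert('US/Eastern')

print(eastern)
```

The resulting Series can then be passed to `set_index`, which yields a timezone-aware `DatetimeIndex`.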

Hmm. Like the other recent tz_localize question, this works fine for me. Does this work for you? I have simplified some of the calls a bit from your example:

import pandas as pd
from numpy.random import randn

df2 = pd.DataFrame(randn(3, 3), columns=['A', 'B', 'C'])
# randn(3,3) returns nine random numbers in a 3x3 array.
# the columns argument to DataFrame names the 3 columns. 
# no datetimes here! (look at df2 to check)

df2['A'] = pd.to_datetime(df2['A'])
# convert the random numbers to datetimes -- look at df2 again
# if A had values to_datetime couldn't handle, we'd clean up A first

df2.set_index('A',drop=False, inplace=True)
# and use that column as an index for the whole df2;

df2.index  = df2.index.tz_localize('GMT').tz_convert('US/Eastern')
# make it timezone-aware in GMT and convert that to Eastern

df2.index.tzinfo
<DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>
cphlewis
  • Hi @cphlewis, thanks for your help. I tried the code you provided and it worked!! Would you mind adding comments to it so I can follow what you did? I'm most confused by the `randn(3, 3)` part. – cgp25 Mar 07 '15 at 17:46
  • You didn't post your data so I used random values. Commented. – cphlewis Mar 07 '15 at 21:11
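Note that the question converts to 'EST' while this answer converts to 'US/Eastern'. The two are not interchangeable: 'EST' is a fixed UTC-5 offset, whereas 'US/Eastern' follows daylight saving time. A small sketch with two made-up timestamps, assumed to be naive GMT values, shows the difference:

```python
import pandas as pd

# Two naive timestamps assumed to be in GMT: one in winter, one in summer.
idx = pd.DatetimeIndex(['2015-01-15 12:00', '2015-07-15 12:00']).tz_localize('GMT')

fixed = idx.tz_convert('EST')           # fixed offset, UTC-5 all year
eastern = idx.tz_convert('US/Eastern')  # UTC-5 in winter, UTC-4 in summer

# Compare the local hour under each zone.
winter_hours = (fixed[0].hour, eastern[0].hour)
summer_hours = (fixed[1].hour, eastern[1].hour)

print(winter_hours, summer_hours)
```

Which zone is appropriate depends on whether the tweets should follow local wall-clock time ('US/Eastern') or a constant offset ('EST').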