I have a DataFrame with a DatetimeIndex, i.e.

import pandas as pd
dates = pd.date_range('2018-04-01', periods=96, freq='15T', tz='Australia/Sydney', name='timestamp')
df = dates.to_frame(index=False)
df.set_index(dates.name, inplace=True)

I want to add a 0/1 indicator column that is 1 during summer time (daylight saving) and 0 during winter, but I cannot find a relevant dst / is_dst attribute. I want something like

df['is_dst'] = df.index.is_dst()

Can anyone advise what the correct method / property is? Or do I need to convert to a different 'datetime' class?

I need something general, i.e. something that works for any timezone, including, say, 'Australia/Brisbane', which doesn't have daylight saving. I'd prefer not to have to parse out the timezone offset and try to determine whether it's summer or winter.

Brad Solomon
David Waterworth
  • Maybe this can help your case: https://stackoverflow.com/questions/44124436/python-datetime-to-season – Sanchit Kumar Nov 14 '18 at 02:13
  • Thanks @SanchitKumar, that's useful although I prefer the accepted answer as it gives me the datetime information. The answer you link assumes summer / winter transition is start / end of month – David Waterworth Nov 14 '18 at 03:06

2 Answers

pandas Timestamps have a dst() method, which you can map over the index:

df.index.map(lambda x : x.dst())

A small change turns this into a 0/1 indicator:

df.index.map(lambda x : int(x.dst().total_seconds()!=0))
Out[104]: 
Int64Index([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0],
           dtype='int64', name='timestamp')
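
To get the indicator into the DataFrame as a column, and to check the no-DST case the question mentions, the mapping above can be wrapped in a small helper. This is only a minimal sketch: `dst_indicator` is a hypothetical name (not a pandas function), and the frequency is written as '15min' on the assumption that a recent pandas is in use, where the '15T' alias is deprecated.

```python
import pandas as pd

def dst_indicator(index):
    """Hypothetical helper: list of 0/1 flags, 1 where DST is in effect."""
    # Timestamp.dst() returns a timedelta; a zero timedelta is falsy,
    # and None (tz-naive timestamps) is falsy too, so both map to 0.
    return [int(bool(ts.dst())) for ts in index]

sydney = pd.date_range('2018-04-01', periods=96, freq='15min',
                       tz='Australia/Sydney', name='timestamp')
brisbane = pd.date_range('2018-04-01', periods=96, freq='15min',
                         tz='Australia/Brisbane', name='timestamp')

df = pd.DataFrame(index=sydney)
df['is_dst'] = dst_indicator(df.index)

# Brisbane does not observe daylight saving, so every flag should be 0
df_bne = pd.DataFrame(index=brisbane)
df_bne['is_dst'] = dst_indicator(df_bne.index)
```

For the Sydney index this should flag the first 12 quarter-hour slots (00:00 to 02:45, before the clocks go back), in line with the output above, and give all zeros for Brisbane.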
BENY
  • Do I really have to use a `lambda` mapping here? Shouldn't there be an easier option along the lines of `df['datetime'].dt.is_dst`? – FObersteiner Jun 28 '20 at 18:11

I'm guessing that Wen's method may be a bit faster, but here's a way of working with the underlying Python datetime objects, using the tm_isdst attribute of the result of datetime.timetuple():

>>> is_dst = [x.timetuple().tm_isdst for x in df.index.to_pydatetime()]
>>> pd.Series(is_dst).head()
0    1
1    1
2    1
3    1
4    1
dtype: int64
>>> pd.Series(is_dst).tail()
91    0
92    0
93    0
94    0
95    0
dtype: int64
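
One thing to watch when turning this into a column: pd.Series(is_dst) carries a default integer index, and pandas aligns on index labels when a Series is assigned to a DataFrame, so the values would not line up with the timestamps. A minimal sketch of one way around that, rebuilding the question's frame so the block is self-contained (with '15min' assumed in place of '15T' for newer pandas versions):

```python
import pandas as pd

dates = pd.date_range('2018-04-01', periods=96, freq='15min',
                      tz='Australia/Sydney', name='timestamp')
df = pd.DataFrame(index=dates)

# tm_isdst is 1 in DST, 0 outside DST, and -1 when unknown (e.g. tz-naive)
is_dst = [x.timetuple().tm_isdst for x in df.index.to_pydatetime()]

# Pass the DataFrame's own index so the Series aligns with the timestamps
df['is_dst'] = pd.Series(is_dst, index=df.index)
```

Assigning the plain list (`df['is_dst'] = is_dst`) should work as well, since a list is matched by position rather than by label.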

.timetuple() returns a time.struct_time. From the Python documentation:

The tm_isdst flag of the result is set according to the dst() method: tzinfo is None or dst() returns None, tm_isdst is set to -1; else if dst() returns a non-zero value, tm_isdst is set to 1; else tm_isdst is set to 0.

Example for a single value:

>>> df.index[0].to_pydatetime().timetuple()
time.struct_time(tm_year=2018, tm_mon=4, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=91, tm_isdst=1)

Under the hood, timetuple() simply checks whether the date's .dst() method returns None, a nonzero value, or zero:

    def timetuple(self):
        "Return local time tuple compatible with time.localtime()."
        dst = self.dst()
        if dst is None:
            dst = -1
        elif dst:
            dst = 1
        else:
            dst = 0
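
The three branches above can be exercised with the standard library alone. The sketch below uses a hypothetical `FixedDst` tzinfo subclass (made up for illustration, not a real timezone) whose dst() always returns whatever value it was constructed with, so each case can be triggered without a timezone database.

```python
from datetime import datetime, timedelta, tzinfo

class FixedDst(tzinfo):
    """Hypothetical tzinfo whose dst() always returns the value given."""

    def __init__(self, dst_offset):
        self._dst = dst_offset

    def utcoffset(self, dt):
        # Arbitrary base offset of +10h plus any DST shift
        return timedelta(hours=10) + (self._dst or timedelta(0))

    def dst(self, dt):
        return self._dst

    def tzname(self, dt):
        return "FIXED"

naive = datetime(2018, 4, 1)                                         # tzinfo is None
summer = datetime(2018, 4, 1, tzinfo=FixedDst(timedelta(hours=1)))  # nonzero dst()
winter = datetime(2018, 4, 1, tzinfo=FixedDst(timedelta(0)))        # zero dst()
unknown = datetime(2018, 4, 1, tzinfo=FixedDst(None))               # dst() is None

flags = [d.timetuple().tm_isdst for d in (naive, summer, winter, unknown)]
print(flags)  # [-1, 1, 0, -1]
```

The -1 case matters if the index might be tz-naive: a plain `tm_isdst` column would then contain -1 rather than 0, so it may be worth mapping -1 to 0 (or raising) depending on what the indicator is used for.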
Brad Solomon