I am ignoring the warnings and trying to subclass a pandas DataFrame. My reasons for doing so are as follows:
- I want to retain all the existing methods of DataFrame.
- I want to set a few additional attributes at class instantiation, which will later be used to define additional methods that I can call on the subclass.
Here's a snippet:
import pandas as pd

class SubFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        freq = kwargs.pop('freq', None)
        ddof = kwargs.pop('ddof', None)
        super(SubFrame, self).__init__(*args, **kwargs)
        self.freq = freq
        self.ddof = ddof
        self.index.freq = pd.tseries.frequencies.to_offset(self.freq)

    @property
    def _constructor(self):
        return SubFrame
Here's a usage example. Say I have the DataFrame:
print(df)
               col0     col1     col2
2014-07-31  0.28393  1.84587 -1.37899
2014-08-31  5.71914  2.19755  3.97959
2014-09-30 -3.16015 -7.47063 -1.40869
2014-10-31  5.08850  1.14998  2.43273
2014-11-30  1.89474 -1.08953  2.67830
where the index has no frequency:
print(df.index)
DatetimeIndex(['2014-07-31', '2014-08-31', '2014-09-30', '2014-10-31',
               '2014-11-30'],
              dtype='datetime64[ns]', freq=None)
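
For anyone who wants to reproduce this, a frame like the one above can be rebuilt roughly as follows. The values are copied from the printed output. Building the index with pd.DatetimeIndex from plain date strings is my assumption about how to end up with an index that has no frequency set.

```python
import pandas as pd

# Dates copied from the printed index; no freq is passed, so none is attached.
index = pd.DatetimeIndex(
    ["2014-07-31", "2014-08-31", "2014-09-30", "2014-10-31", "2014-11-30"]
)

# Values copied from the printed frame.
df = pd.DataFrame(
    {
        "col0": [0.28393, 5.71914, -3.16015, 5.08850, 1.89474],
        "col1": [1.84587, 2.19755, -7.47063, 1.14998, -1.08953],
        "col2": [-1.37899, 3.97959, -1.40869, 2.43273, 2.67830],
    },
    index=index,
)

print(df.index.freq)  # None
```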
Using SubFrame allows me to specify that frequency in one step:
sf = SubFrame(df, freq='M')
print(sf.index)
DatetimeIndex(['2014-07-31', '2014-08-31', '2014-09-30', '2014-10-31',
               '2014-11-30'],
              dtype='datetime64[ns]', freq='M')
The issue is, this modifies df:
print(df.index.freq)
<MonthEnd>
What's going on here, and how can I avoid this?
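
One workaround I can sketch, assuming the cause is that the new object and df end up holding the same index object (which I have not confirmed): build a new DatetimeIndex that carries the frequency and attach it to a copy, rather than assigning to .freq on the existing index. The three-row frame and the name out are only for illustration.

```python
import pandas as pd

# Small illustrative frame whose index has no frequency.
df = pd.DataFrame(
    {"col0": [0.28393, 5.71914, -3.16015]},
    index=pd.DatetimeIndex(["2014-07-31", "2014-08-31", "2014-09-30"]),
)

# Construct a separate index object with the desired frequency.
# pandas checks that the dates actually conform to the given offset.
new_index = pd.DatetimeIndex(df.index, freq=pd.offsets.MonthEnd())

# Attach the new index to a copy so the original frame is left alone.
out = df.copy()
out.index = new_index

print(df.index.freq)   # None
print(out.index.freq)  # <MonthEnd>
```

Whether something equivalent belongs inside __init__ of the subclass is a separate design question.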
Moreover, I confess to using copied code that I don't understand all that well. What is happening within __init__ above? Is it necessary to use args/kwargs with pop here? (Why can't I just specify the parameters as usual?)
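
To make the last question concrete, this is roughly what I mean by specifying the parameters as usual. It is a sketch that assumes Python 3, where keyword-only arguments may follow *args. The index.freq assignment is left out on purpose so that only the signature differs from my snippet.

```python
import pandas as pd

class SubFrame(pd.DataFrame):
    # freq and ddof are keyword-only here, so they never reach
    # DataFrame.__init__ and nothing has to be popped from kwargs.
    def __init__(self, *args, freq=None, ddof=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.freq = freq
        self.ddof = ddof

    @property
    def _constructor(self):
        return SubFrame

sf = SubFrame({"a": [1, 2]}, freq="M", ddof=1)
print(sf.freq, sf.ddof)  # M 1
```

As far as I can tell, the pop version does the same job in a way that also works on Python 2, which has no keyword-only argument syntax.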