Creating a pandas.DataFrame wrapper that has a method to return a dataframe

Question

I'm trying to create a class that is a wrapper around pandas.DataFrame objects. After I write my own methods for this class, I would like pandas' methods to also be available, but in a way that explicitly tells me/the user that they are from pandas. It would work like this:

df = pd.DataFrame(np.random.randn(5,2))
md = myData(df)

a = md.df # returns the original pandas.DataFrame "df" to a (equivalent of a = df)
print(md) # prints as myData class
print(md.df) # prints just as print(df) would. Equiv to print(df)

md.mean() # mean as defined in myData class. Returns myData object
md.df.mean() # mean as defined in pandas. Returns a pandas object (a Series)

md.std() # myData std
md.df.std() # pandas std

So far all my attempts have been unsuccessful. One thing that I really thought should work, but doesn't, is:

import pandas as _pd
class myData(_pd.DataFrame):
    """
    Attempt to create a myData object
    """
    def __init__(self, df, dic):
        df = df.copy()
        print(type(df))
        self.df = df
        self = df

It exits with RuntimeError: maximum recursion depth exceeded while calling a Python object.

EDIT

The following code ends with the same error.

import pandas as _pd
class myData(_pd.DataFrame):
    """
    Attempt to create a myData object
    """
    def __init__(self, df, dic):
        df = df.copy()
        self.dic = dic
        super(myData, self).__init__(df)
        self.df = df

However, if I try

    def __init__(self, df, dic):
        df = df.copy()
        super(myData, self).__init__(df)

Then it works, but the result is a myData object which is really a DataFrame, since every method is already that of DataFrames.

Any idea of what might be wrong with the code or if there is a way to make this better?

You can access the parent class' attributes and methods using `super()` as discussed in [this question](http://stackoverflow.com/q/576169/3991125). — albert, Jul 19 '16 at 16:51

score 1 · Answer 1 · answered Jul 19 '16 at 18:08

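You don't have to use DataFrame as the parent. Instead, store the DataFrame as an attribute of a plain class (composition rather than inheritance):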

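import numpy as np
import pandas as pd

class myData(object):
    """
    Attempt to create a myData object
    """
    def __init__(self, df):
        # Keep a private copy so the caller's DataFrame is not modified
        self.df = df.copy()

df = pd.DataFrame(np.random.randn(100, 5), columns=list('ABCDE'))
mdf = myData(df)
mdf.df.describe()  # pandas methods are reached explicitly through .df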
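
Building on that, here is a minimal sketch of how the interface from the question could look with composition. The bodies of `mean`, `std` and `__repr__` are only assumptions made for illustration (they wrap the pandas result back into a `myData` object); the question does not say what the custom versions should compute, so replace them with your own logic.

```python
import numpy as np
import pandas as pd


class myData(object):
    """Wrapper that keeps the DataFrame in the ``df`` attribute."""

    def __init__(self, df):
        # Copy so that changes made through the wrapper do not touch the input
        self.df = df.copy()

    def __repr__(self):
        # Illustrative representation only; print(md) falls back to this
        return "myData(rows={}, cols={})".format(*self.df.shape)

    def mean(self):
        # Placeholder custom mean: column means as a one-row frame,
        # wrapped so that the result is again a myData object
        return myData(self.df.mean().to_frame().T)

    def std(self):
        # Placeholder custom std, wrapped the same way
        return myData(self.df.std().to_frame().T)


df = pd.DataFrame(np.random.randn(5, 2))
md = myData(df)

custom_mean = md.mean()     # myData object
pandas_mean = md.df.mean()  # plain pandas result, reached explicitly via .df
```

Because `myData` does not inherit from `DataFrame`, plain attribute assignment such as `self.df = ...` does not go through pandas' attribute handling, and every pandas method has to be called explicitly through `md.df`, which is the separation the question asks for.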
