Python method chaining on methods from a class

Question

I have created the class below, which takes a pandas data frame and returns either an aggregate of it or a sample of it. I can call each of those methods separately, but I am unable to chain them the way df.columns.to_list() is chained. How can I make it work?

import pandas as pd    
iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')




class MyClass:

    def __init__(self, df):
        self.df = df

    def return_agg(self):
        non_num = self.df.select_dtypes(exclude='number').columns.to_list()
        self.df = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return self.df

    def return_sample(self):
        self.sample = self.df.sample(frac=0.1, replace=True, random_state=1)
        return self.sample


a = MyClass(iris)
a.return_sample()  # works
a.return_agg()  # works
a.return_sample().return_agg()  # does not work

After making the change suggested in the answers below, method chaining works, but the result is not what I expected.

a = MyClass(iris)
df1=a.return_agg().df
df2=a.return_sample().return_agg().df
df1
[44]:
species sepal_length    sepal_width petal_length    petal_width
0   setosa  250.3   171.4   73.1    12.3
1   versicolor  296.8   138.5   213.0   66.3
2   virginica   329.4   148.7   277.6   101.3

df2
[45]:
species sepal_length    sepal_width petal_length    petal_width
0   setosa  250.3   171.4   73.1    12.3
1   versicolor  296.8   138.5   213.0   66.3
2   virginica   329.4   148.7   277.6   101.3

df2 should be different from df1, because it should aggregate the sample rather than the full data frame.

Because `a.return_sample()` doesn't return the object instance; it returns `self.sample`, which is a DataFrame. - A l w a y s S u n n y, Sep 11 '22 at 12:21

ThePyGuy · Answer 1 · 2022-09-11T12:31:34.400

In your current implementation, self.sample is an instance of pandas.DataFrame, not MyClass. Since a pandas DataFrame has no return_agg method, a.return_sample().return_agg() raises an AttributeError. If you create an instance of MyClass to store self.sample, the provided function calls should work as expected:

class MyClass:

    def __init__(self, df):
        self.df = df

    def return_agg(self):
        non_num = self.df.select_dtypes(exclude='number').columns.to_list()
        self.df = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return self.df

    def return_sample(self):
        # Instance of MyClass                <----
        self.sample = MyClass(self.df.sample(frac=0.1, 
                                             replace=True, 
                                             random_state=1)
                              )
        return self.sample
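
A variation on the class above, offered as a minimal sketch rather than the definitive fix: if both methods return a new MyClass instance and neither reassigns self.df, the calls can be chained in either order and the original data stays intact. That is what lets the aggregate of the sample differ from the aggregate of the full data. The toy frame and frac=0.5 are assumptions, chosen only so the example runs without downloading iris.

```python
import pandas as pd


class MyClass:

    def __init__(self, df):
        self.df = df

    def return_agg(self):
        # Group by every non-numeric column and sum the numeric ones.
        non_num = self.df.select_dtypes(exclude='number').columns.to_list()
        agg = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return MyClass(agg)  # new wrapper; self.df is left untouched

    def return_sample(self):
        sample = self.df.sample(frac=0.5, replace=True, random_state=1)
        return MyClass(sample)  # new wrapper around the sampled rows


# Hypothetical stand-in for the iris data (assumption, not from the question)
toy = pd.DataFrame({
    'species': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
    'length': [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})

a = MyClass(toy)
df1 = a.return_agg().df                  # aggregate of the full data
df2 = a.return_sample().return_agg().df  # aggregate of a sample
```

Because a.df is never overwritten, the order in which df1 and df2 are computed no longer matters.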

As for the chaining you mention in the question: in df.columns.to_list(), columns is an attribute, not a method. Its value is an instance of another class (pandas.Index), and that class implements to_list().
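
To illustrate the point about attributes, here is a small sketch using a made-up two-column frame (the names df, x and y are only for illustration). The chain works because each step hands back an object that has the next method:

```python
import pandas as pd

# Hypothetical frame, used only to inspect the type of df.columns
df = pd.DataFrame({'x': [1, 2], 'y': [3, 4]})

cols = df.columns            # attribute access, no call involved
print(type(cols).__name__)   # Index
print(cols.to_list())        # ['x', 'y']
```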
