I have created a class, shown below, that takes a pandas DataFrame and returns either an aggregate of it or a sample of it. I can call each method separately, but I am unable to chain them the way I can with df.columns.to_list(). How can I make this work?
import pandas as pd

iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

class MyClass:
    def __init__(self, df):
        self.df = df

    def return_agg(self):
        # Group by every non-numeric column and sum the numeric ones.
        non_num = self.df.select_dtypes(exclude='number').columns.to_list()
        self.df = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return self.df

    def return_sample(self):
        # Draw a 10% sample with replacement; the seed makes it reproducible.
        self.sample = self.df.sample(frac=0.1, replace=True, random_state=1)
        return self.sample

a = MyClass(iris)
a.return_sample()  # works
a.return_agg()  # works
a.return_sample().return_agg()  # does not work
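The last line fails because return_sample() returns a pandas DataFrame, and a DataFrame has no return_agg method. A minimal sketch of the usual fix is to return self from each method, so the next call in the chain is looked up on the class again. The name ChainableClass and the small inline frame are hypothetical stand-ins (the frame replaces the iris data so the example runs offline):

```python
import pandas as pd

# Hypothetical stand-in for the iris data, so the sketch runs offline.
toy = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0],
})

class ChainableClass:
    def __init__(self, df):
        self.df = df

    def return_agg(self):
        non_num = self.df.select_dtypes(exclude="number").columns.to_list()
        self.df = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return self  # return the instance, not the DataFrame

    def return_sample(self):
        # Same as the original: the sample is stored in self.sample,
        # and self.df is left untouched.
        self.sample = self.df.sample(frac=0.5, replace=True, random_state=1)
        return self  # return the instance, not the DataFrame

result = ChainableClass(toy).return_sample().return_agg()
print(result.df)
```

Because the methods now return the instance, the resulting frame has to be read from the .df attribute at the end of the chain.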
After changing the methods to return self, as suggested in the answers below, the method chaining works, but the result is not what I expected.
a = MyClass(iris)
df1=a.return_agg().df
df2=a.return_sample().return_agg().df
df1
      species  sepal_length  sepal_width  petal_length  petal_width
0      setosa         250.3        171.4          73.1         12.3
1  versicolor         296.8        138.5         213.0         66.3
2   virginica         329.4        148.7         277.6        101.3

df2
      species  sepal_length  sepal_width  petal_length  petal_width
0      setosa         250.3        171.4          73.1         12.3
1  versicolor         296.8        138.5         213.0         66.3
2   virginica         329.4        148.7         277.6        101.3
I expected df2 to differ from df1, because it should be aggregating the 10% sample rather than the full data.
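The two results most likely match because return_sample stores its output in self.sample while return_agg always reads self.df. In addition, the first a.return_agg() call has already overwritten a.df with the aggregated frame. One possible way around this, sketched under the assumption that each step should return a new wrapper instead of mutating the existing one (FrameChain, the frac parameter, and the toy data are hypothetical, not part of the original code):

```python
import pandas as pd

# Hypothetical stand-in for the iris data, so the sketch runs offline.
toy = pd.DataFrame({
    "species": ["a"] * 5 + ["b"] * 5,
    "x": [float(i) for i in range(10)],
})

class FrameChain:
    def __init__(self, df):
        self.df = df

    def return_agg(self):
        non_num = self.df.select_dtypes(exclude="number").columns.to_list()
        agg = self.df.groupby(non_num, dropna=False).sum().reset_index()
        return FrameChain(agg)  # new wrapper; self.df is not modified

    def return_sample(self, frac=0.5):
        sample = self.df.sample(frac=frac, replace=True, random_state=1)
        return FrameChain(sample)  # the next step operates on the sample

a = FrameChain(toy)
df1 = a.return_agg().df                  # aggregate of the full frame
df2 = a.return_sample().return_agg().df  # aggregate of the sample only
print(df1)
print(df2)
```

With this design a.df stays the original frame after any chain, so df1 and df2 are computed from different inputs and the order of the two calls no longer matters.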