I want to collect some pandas functions in a dict in which each key is a column name (string) and each value is a list (or any other collection) of functions. I then want to be able to dispatch those functions on the corresponding column of a particular df.

I tried doing something like

 dispatcher = {"ABC": pd.isnull}

but when I tried to call the value of this key-value pair on a df, I got AttributeError: 'Series' object has no attribute 'x'.

Is something like this achievable?

I also saw this thread, but it didn't help because the functions stored there weren't called on an object (a dataframe).

@EDIT: I'm looking for something that would allow me to store functions like that:

dispatcher = {
              "ABC": [func1, func2, func3],
              "DEF": [func4, func5]
             }

and then, when working on some example_df, dispatch those functions on the corresponding columns. So it would call the functions like this:

example_df["ABC"].func1()
example_df["ABC"].func2()
example_df["ABC"].func3()
example_df["DEF"].func4()
example_df["DEF"].func5()
Jakub Sapko

1 Answer

A possible solution (using a toy dataframe as an example):

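from io import StringIO

import pandas as pd

text = """
a   b
NaN 2
1   3
"""
# Raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(StringIO(text), sep=r'\s+')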

dispatcher = {"ABC": pd.isnull}
dispatcher.get('ABC')(df)

Output:

       a      b
0   True  False
1  False  False

In case we want to specify the column where pd.isnull should be applied:

from functools import partial

def dfnull(col, df):
    return pd.isnull(df[col])

dispatcher = {"ABC": partial(dfnull, 'a')}
dispatcher.get('ABC')(df)
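
The same idea can be extended to a list of functions per column. What follows is only a minimal sketch, assuming every stored function accepts a Series as its first argument; the helper name `dispatch` and the choice of functions are illustrative and not from the question. Note that `pd.isnull` is an alias of `pd.isna`, so the sketch uses the `isna`/`notna` names directly to keep the result keys predictable.

```python
from io import StringIO

import pandas as pd

text = """
a   b
NaN 2
1   3
"""
df = pd.read_csv(StringIO(text), sep=r"\s+")

# Column name -> list of functions to run on that column (illustrative choice)
dispatcher = {
    "a": [pd.isna, pd.notna],
    "b": [pd.isna],
}


def dispatch(frame, mapping):
    """Hypothetical helper: call every function on its column, collect results."""
    results = {}
    for col, funcs in mapping.items():
        for func in funcs:
            # Each function receives the column (a Series) as its argument
            results[(col, func.__name__)] = func(frame[col])
    return results


results = dispatch(df, dispatcher)
for key, value in results.items():
    print(key)
    print(value)
```

The results are keyed by `(column, function name)`, so a single call returns everything; whether a dict, a list, or a new DataFrame is the better container depends on what is done with the results afterwards.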

EDIT 1

The following code should address the OP's comment:

cols = ['a', 'b']
funcs = [dfnull, dfnull]

# Bind each function to its column with partial; no eval or string building needed
d = {col: partial(func, col) for func, col in zip(funcs, cols)}

d.get('a')(df)
d.get('b')(df)

Output:

>>> d.get('a')(df)
0     True
1    False
Name: a, dtype: bool
>>> d.get('b')(df)
0    False
1    False
Name: b, dtype: bool

EDIT 2

The following should address the OP's second comment:

myfuncs = {'func1': pd.isnull, 'func2': pd.isna}

class MySeries(pd.Series):
    def __init__(self, df):
        super().__init__(df)
        
        for key in myfuncs:
            func = partial(myfuncs[key], df)
            setattr(self, key, func)

MySeries(df['a']).func1()
MySeries(df['a']).func2()
MySeries(df['b']).func2()
MySeries(df['b']).func1()

Output:

>>> MySeries(df['a']).func1()
0     True
1    False
Name: a, dtype: bool
>>> MySeries(df['a']).func2()
0     True
1    False
Name: a, dtype: bool
>>> MySeries(df['b']).func2()
0    False
1    False
Name: b, dtype: bool
>>> MySeries(df['b']).func1()
0    False
1    False
Name: b, dtype: bool

To call all of the functions registered for a column, use the following:

mydict = {'a': {'func1': pd.isnull, 'func2': pd.isna},
          'b': {'func1': pd.isnull, 'func2': pd.isna}}

def fire_all(col):
    myfuncs = mydict[col]
    
    class MySeries(pd.Series):
        def __init__(self, df):
            super().__init__(df)
            for key in myfuncs:
                func = partial(myfuncs[key], df)
                setattr(self, key, func)
                
    for f in myfuncs:
        # Look the attribute up by name instead of building a string for eval
        print(getattr(MySeries(df[col]), f)())


fire_all('b')
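
If the goal is literally to get calls of the form example_df["ABC"].func1(), one alternative that avoids subclassing Series is to store method names as strings and resolve them with `getattr`. This is a sketch under assumptions: the names in the dict must refer to existing Series methods that can be called without arguments, and `call_methods`, the sample data, and the chosen methods are hypothetical.

```python
import pandas as pd

example_df = pd.DataFrame({"ABC": [1.0, None, 3.0], "DEF": ["x", "y", "y"]})

# Column name -> names of Series methods to call on that column (illustrative)
dispatcher = {
    "ABC": ["isna", "sum"],
    "DEF": ["nunique"],
}


def call_methods(frame, mapping):
    """Hypothetical helper: equivalent to frame[col].<name>() for each entry."""
    out = {}
    for col, names in mapping.items():
        for name in names:
            # getattr resolves the bound method on the column's Series
            out[(col, name)] = getattr(frame[col], name)()
    return out


out = call_methods(example_df, dispatcher)
print(out[("ABC", "isna")])
print(out[("ABC", "sum")])
print(out[("DEF", "nunique")])
```

Strings only reach methods that Series already defines; for custom functions, storing the callables themselves and calling func(example_df[col]) is the simpler route.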
PaulS
  • This doesn't answer my question enough, as I specified I'm looking to store the functions in a list (or any other collection) as a value and a column name as a key (on which the collected functions will be dispatched) – Jakub Sapko Nov 02 '22 at 08:51
  • I feel like we're still missing each other with the idea. Please check my edit for the post, hopefully this example will be more clear – Jakub Sapko Nov 02 '22 at 13:17
  • I guess the second edit of my answer does what you want, @JakubSapko. If not, maybe somebody else can help you. – PaulS Nov 02 '22 at 18:19