You could use rgetattr to get attributes from the Series, testframe[col]. For example,
In [74]: s = pd.Series(['1','2'])
In [75]: rgetattr(s, 'str.replace')('1', 'A')
Out[75]:
0    A
1    2
dtype: object
import functools
import pandas as pd

def rgetattr(obj, attr, *args):
    # Like getattr, but attr may be a dotted chain such as 'str.replace'.
    def _getattr(obj, attr):
        return getattr(obj, attr, *args)
    return functools.reduce(_getattr, [obj] + attr.split('.'))

testframe = pd.DataFrame.from_dict({'col1': [1, 2], 'col2': [3, 4]})
funcdict = {'col1': ['astype', 'str.replace'],
            'col2': ['astype', 'str.replace']}
argdict = {'col1': [['str'], ['1', 'A']], 'col2': [['str'], ['3', 'B']]}

# Apply each method named in funcdict[col], with the matching args, in order.
for col in testframe.columns:
    for attr, args in zip(funcdict[col], argdict[col]):
        testframe[col] = rgetattr(testframe[col], attr)(*args)
print(testframe)
yields
  col1 col2
0    A    B
1    2    4
getattr is the Python built-in function for getting a named attribute from an object when the name is given as a string. For example, given
In [92]: s = pd.Series(['1','2']); s
Out[92]:
0    1
1    2
dtype: object
we can obtain s.str using
In [85]: getattr(s, 'str')
Out[85]: <pandas.core.strings.StringMethods at 0x7f334a847208>
In [91]: s.str == getattr(s, 'str')
Out[91]: True
To obtain s.str.replace, we would need
In [88]: getattr(getattr(s, 'str'), 'replace')
Out[88]: <bound method StringMethods.replace of <pandas.core.strings.StringMethods object at 0x7f334a847208>>
In [90]: s.str.replace == getattr(getattr(s, 'str'), 'replace')
Out[90]: True
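The same behaviour can be checked without pandas. The following is only a minimal sketch: obj and its text attribute are hypothetical stand-ins (built with types.SimpleNamespace) for a Series and its .str accessor, and it also shows the optional third argument of getattr, which is returned when the attribute is missing.

```python
from types import SimpleNamespace

# Hypothetical stand-in: `text` plays the role of an accessor such as s.str.
obj = SimpleNamespace(text=SimpleNamespace(value="hello"))

# One level: the attribute name is supplied as a string.
inner = getattr(obj, "text")
same_object = inner is obj.text

# Two levels: each dot in obj.text.value costs one getattr call.
value = getattr(getattr(obj, "text"), "value")

# The optional third argument is returned when the attribute is missing.
fallback = getattr(obj, "missing", "n/a")

print(same_object, value, fallback)
```

Without the third argument, getattr(obj, "missing") would raise AttributeError instead.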
However, if we specify
funcdict = {'col1': ['astype', 'str.replace'],
            'col2': ['astype', 'str.replace']}
then we need some way to distinguish the cases that need a single call to getattr (e.g. getattr(testframe[col], 'astype')) from those that need multiple calls to getattr (e.g. getattr(getattr(testframe[col], 'str'), 'replace')).
To unify the two cases into one simple syntax, we can use rgetattr, a recursive drop-in replacement for getattr that can handle dotted chains of string attribute names such as 'str.replace'.
The recursion is handled by functools.reduce. The docs give as an example that reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). Similarly, you can imagine the + being replaced by getattr, so that rgetattr(s, 'str.replace') calculates getattr(getattr(s, 'str'), 'replace').
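To make the fold concrete, here is a rough sketch that compares rgetattr with an explicit loop and with hand-nested getattr calls. The names rgetattr_loop and holder are hypothetical, introduced only for illustration, and a plain string attribute stands in for the Series so the snippet runs without pandas. It also shows operator.attrgetter from the standard library, which accepts dotted names too but has no default-value argument.

```python
import functools
import operator
from types import SimpleNamespace


def rgetattr(obj, attr, *args):
    def _getattr(obj, attr):
        return getattr(obj, attr, *args)
    return functools.reduce(_getattr, [obj] + attr.split('.'))


def rgetattr_loop(obj, attr, *args):
    """The same fold written as an explicit loop (hypothetical helper)."""
    for name in attr.split('.'):
        obj = getattr(obj, name, *args)
    return obj


# Hypothetical stand-in object whose `text` attribute is a plain str.
holder = SimpleNamespace(text="a-b-c")
dotted = "text.replace"

via_reduce = rgetattr(holder, dotted)("-", "+")
via_loop = rgetattr_loop(holder, dotted)("-", "+")
via_nested = getattr(getattr(holder, "text"), "replace")("-", "+")
via_attrgetter = operator.attrgetter(dotted)(holder)("-", "+")

print(via_reduce, via_loop, via_nested, via_attrgetter)
```

All four calls resolve to the same bound method, holder.text.replace, so they return the same string.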