Using Python pandas, I have been trying to use a function as one of several replacement values in a pandas.DataFrame (i.e. one of the replacements should itself be the result of a function call). My understanding is that pandas.DataFrame.replace delegates internally to re.sub and that anything that works with re.sub should also work with pandas.DataFrame.replace, provided that the regex parameter is set to True.
Accordingly, I followed guidance given elsewhere on Stack Overflow for re.sub and attempted to apply it to pandas.DataFrame.replace (using replace with regex=True, inplace=True and with to_replace set either as a nested dictionary, when targeting a specific column, or otherwise as two lists, per its documentation). My code works fine without a function call, but fails if I provide a function as one of the replacement values, even though I do so in the same manner as with re.sub (which I tested, and which worked correctly). I realize that the function is expected to accept a match object as its only required parameter and return a string.
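To make the two call shapes concrete, here is a minimal sketch using plain string replacements, which is the case that works. The sample DataFrame and the replacement values "REPL" and "OTHER" are made up for illustration and are not from my real data.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
data = pd.DataFrame({
    "col1": ["a string here", "other"],
    "col2": ["string", "x"],
})

# Form 1: nested dictionary, restricting the replacement to col1.
by_column = data.replace(to_replace={"col1": {"string": "REPL"}}, regex=True)

# Form 2: two parallel lists (patterns and replacements), applied to every column.
everywhere = data.replace(
    to_replace=["string", "other"],
    value=["REPL", "OTHER"],
    regex=True,
)

print(by_column)
print(everywhere)
```

Both calls return a new DataFrame here; passing inplace=True instead would modify data directly.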
Instead of containing the result of the function call, the resulting DataFrame contains the function itself (i.e. the uncalled, first-class function object).
Why is this occurring, and how can I get this to work correctly (i.e. have the function's result returned and stored)? If this is not possible, I would appreciate it if a viable and "Pandasonic" alternative could be suggested.
I provide an example of this below:
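# file_name and data are defined elsewhere.
def fn(match):
    id = match.group(1)
    result = None
    with open(file_name, 'r') as file:
        for line in file:
            if 'string' in line:
                # Keep the last whitespace-separated token of the matching line.
                result = line.split()[-1]
    # Fall back to the captured group if no line matched.
    return (result or id)

data.replace(to_replace={'col1': {'string': fn}},
             regex=True, inplace=True)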
The above does not work, in that it replaces the right search string, but replaces it with:
<function fn at 0x3ad4398>
For the above (contrived) example, the expected output would be that every occurrence of "string" in col1 is replaced with the string returned from fn.
However, import re; print(re.sub('string', fn, 'test string')) works as expected (and as previously depicted).
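For comparison, here is a minimal self-contained sketch of the behaviour I am after. It assumes a simplified, hypothetical fn that upper-cases the match instead of reading a file, and a made-up two-row DataFrame. It also shows one candidate alternative: Series.str.replace is documented to accept a callable repl when regex=True, so it may be a workable substitute for a single column, though I have not confirmed it covers every case that DataFrame.replace does.

```python
import re

import pandas as pd


def fn(match):
    # Hypothetical stand-in for the file lookup: upper-case the matched text.
    return match.group(0).upper()


# re.sub calls fn once per match and inserts its return value.
print(re.sub("string", fn, "test string"))

# Made-up sample data for illustration.
data = pd.DataFrame({"col1": ["test string", "no match"]})

# Series.str.replace accepts a callable replacement when regex=True;
# the callable receives the match object and must return a string.
data["col1"] = data["col1"].str.replace("string", fn, regex=True)
print(data["col1"].tolist())
```

This only handles one column per call, so replacing across several columns would need one such call per column.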