I've written what's effectively a parser for a large amount of sequential data chunks, and I need to write a number of functions to analyze the data chunks in various ways. The parser contains some useful functionality for me such as frequency of reading data into (previously-instantiated) objects, conditional filtering of the data, and when to stop reading the file.
I would like to write external analysis functions in separate modules, import the parser, and pass the analysis function into the parser to evaluate at the end of every data chunk read. In general, the analysis functions will require variables modified within the parser itself (i.e. the data chunk that was read), but it may need additional parameters from the module where it's defined.
Here's essentially what I would like to do for the parser:
def parse_chunk(dat_file, dat_obj1, dat_obj2, parse_arg1=None, fun=None, **fargs):
# Process optional arguments to parser...
with open(dat_file,'r') as dat:
# Parse chunk of dat_file based on parse_arg1 and store data in dat_obj1, dat_obj2, etc.
dat_obj1.attr = parsed_data
local_var1 = dat_obj1.some_method()
# Call analysis function passed to parser
if fun != None:
return fun(**fargs)
In another module, I would have something like:
from parsemod import parse_chunk
def main_script():
# Preprocess data from other files
dat_obj1 = ...
dat_obj2 = ...
script_var1 = ...
# Parse data and analyze
result = parse_chunk(dat_file, dat_obj1, dat_obj2, fun=eval_prop,
dat_obj1=None, local_var1=None, foo=script_var1)
def eval_data(dat_obj1, local_var1, foo):
# Analyze data
...
return result
I've looked at similar questions such as this and this, but the issue here is that eval_data()
has arguments which are modified or set in parse()
, and since **fargs
provides a dictionary, the variable names themselves are not in the namespace for parse()
, so they aren't modified prior to calling eval_data()
.
I've thought about modifying the parser to just return all variables after every chunk read and call eval_data()
from main_script()
, but there are too many different possible variables needed for the different eval_data()
functional forms, so this gets very clunky.
Here's another simplified example that's even more general:
def my_eval(fun, **kwargs):
x = 6
z = 1
return fun(**kwargs)
def my_fun(x, y, z):
return x + y + z
my_eval(my_fun, x=3, y=5, z=None)
I would like the result of my_eval()
to be 12, as x gets overwritten from 3 to 6 and z gets set to 1. I looked into functools.partial
but it didn't seem to work either.