2

I am running some numerical simulations, in which my main function must receive lots and lots of arguments - I'm talking 10 to 30 arguments depending on the simulation to run.

What are some best practices to handle cases like this? Dividing the code into, say, 10 functions with 3 arguments each doesn't sound very feasible in my case.

What I do is create an instance of a class (with no methods), store the inputs as attributes of that instance, then pass the instance - so the function receives only one input.

I like this because the code looks clean, easy to read, and because I find it easy to define and run alternative scenarios.

I dislike it because accessing class attributes within a function is slower than accessing a local variable (see: How / why to optimise code by copying class attributes to local variables?) and because it is not an efficient use of memory - too much data stored multiple times unnecessarily.

Any thoughts or recommendations?

class MyInput:
    """Plain container for the simulation inputs (no methods)."""

myinput = MyInput()
myinput.input_sql_table = that_sql_table
myinput.input_file = that_input_file
myinput.param1 = param1
myinput.param2 = param2
myoutput = calc(myinput)

Alternative scenarios:

import collections
import copy

inputs = collections.OrderedDict()
scenarios = collections.OrderedDict()
inputs['base scenario'] = copy.deepcopy(myinput)

inputs['param2 = 100'] = copy.deepcopy(myinput)
inputs['param2 = 100'].param2 = 100

# Loop through all the inputs and store the outputs in the ordered dictionary scenarios
for name, scenario_input in inputs.items():
    scenarios[name] = calc(scenario_input)
halfer
Pythonista anonymous

2 Answers

3

I don't think this is really a Stack Overflow question; it is more of a Software Engineering question. For example, check out this question.

As far as whether or not this is a good design pattern, this is an excellent way to handle a large number of arguments. You mentioned that this isn't very efficient in terms of memory or speed, but I think you're making an improper micro-optimization.

As far as memory is concerned, the overhead of running the Python interpreter is going to dwarf the couple of extra bytes used by instantiating your class.
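To make the memory point a bit more concrete, here is a minimal sketch of how scenario variants can share one large object instead of duplicating it. It assumes the inputs are held in a dataclass; `SimInput`, `table`, and `big_table` are hypothetical names used only for illustration, and the list stands in for a large dataframe. `dataclasses.replace` builds a new instance that copies only the references, whereas `copy.deepcopy` duplicates the underlying data.

```python
import copy
from dataclasses import dataclass, replace


@dataclass
class SimInput:
    """Hypothetical input container; `table` stands in for a large dataframe."""
    table: list
    param1: float = 1.0
    param2: int = 0


big_table = list(range(100_000))  # placeholder for the large dataset
base = SimInput(table=big_table, param1=0.5, param2=10)

# New scenario: only param2 changes, the table reference is shared.
variant = replace(base, param2=100)

# For comparison, deepcopy duplicates the table as well.
deep = copy.deepcopy(base)

print(variant.table is base.table)  # same object, no extra copy
print(deep.table is base.table)     # a separate copy of the data
```

Sharing the reference is only safe if the scenarios treat the large object as read-only; if one scenario mutates it in place, every scenario sees the change.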

Unless you have run a profiler and determined that accessing members of that options class is slowing you down, I wouldn't worry about it. This is especially the case because you're using Python. If speed is a real concern, you should be using something else.

You may not be aware of this, but most of the large-scale number-crunching libraries for Python aren't actually written in Python; they're wrappers around C/C++ libraries that are much faster.

I recommend reading this article. As the well-known saying goes, "premature optimization is the root of all evil".

Nick Chapman
  • Yes, I am familiar with the concept of 'premature optimisation', and I know that numpy and many other libraries are compiled in faster languages. In the example linked in my original post, using local variables leads to a difference in speed of roughly 30%, which is material enough in my case. – Pythonista anonymous Mar 25 '19 at 14:38
  • @Pythonistaanonymous have you actually run a profiler to determine whether or not this matters outside of your toy example? – Nick Chapman Mar 25 '19 at 14:41
  • As for memory, I suppose another way to look at this is to have one input class for the attributes which don't occupy a lot of memory space, and another one for those which do - typically one large dataframe of a few hundred MBs. This way I wouldn't be storing this large dataframe multiple times unnecessarily. The code would get a bit more verbose and harder to read, so I have to think carefully of the balance of pros and cons. – Pythonista anonymous Mar 25 '19 at 14:43
  • Yes, I don't remember the exact results by heart, but I do remember the profiler showed a lot of time was spent getting attributes from the class. This specific part can be solved by creating local variables that read from the class, as in the toy example. – Pythonista anonymous Mar 25 '19 at 14:45
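The workaround mentioned in the last comment (read the attributes into local variables once, then use the locals inside the hot loop) can be sketched and timed as follows. This is only an illustration: `calc_with_attributes`, `calc_with_locals`, and the parameter values are hypothetical, and the measured difference will depend on the Python version and on how much real work the loop does.

```python
import timeit
from types import SimpleNamespace


def calc_with_attributes(inp, n):
    """Looks up inp.param1 and inp.param2 on every iteration."""
    total = 0.0
    for i in range(n):
        total += inp.param1 * i + inp.param2
    return total


def calc_with_locals(inp, n):
    """Copies the attributes to local variables once, before the loop."""
    param1, param2 = inp.param1, inp.param2
    total = 0.0
    for i in range(n):
        total += param1 * i + param2
    return total


inp = SimpleNamespace(param1=2.0, param2=1.0)  # stand-in for the input object
n = 100_000

t_attr = timeit.timeit(lambda: calc_with_attributes(inp, n), number=20)
t_local = timeit.timeit(lambda: calc_with_locals(inp, n), number=20)
print(f"attribute access: {t_attr:.3f} s, local variables: {t_local:.3f} s")
```

The single input object is still what gets passed around, so the call sites stay clean; only the inner loop changes.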
2

You could pass in a dictionary like so:

def some_func_or_class(kwarg1: int = -1, kwarg2: int = 0, kwargN: str = ''):
    print(kwarg1, kwarg2, kwargN)

# Keys must be strings that match the parameter names
all_the_kwargs = {'kwarg1': 0, 'kwarg2': 1, 'kwargN': 'xyz'}
some_func_or_class(**all_the_kwargs)

Or you could use several named tuples like referenced here: Type hints in namedtuple
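As a rough sketch of the named tuple idea, assuming the class-based `typing.NamedTuple` syntax is available (Python 3.6.1 or later for default values): `SimParams`, its fields, and `calc` are hypothetical names, not something taken from the question.

```python
from typing import NamedTuple


class SimParams(NamedTuple):
    """Hypothetical, immutable bundle of simulation parameters."""
    param1: float = 1.0
    param2: int = 0
    label: str = ""


def calc(params: SimParams) -> float:
    # Placeholder computation; the real simulation would go here.
    return params.param1 * params.param2


base = SimParams(param1=2.5, param2=4, label="base scenario")

# _replace returns a new tuple with the given fields changed.
variant = base._replace(param2=100, label="param2 = 100")

print(calc(base), calc(variant))
```

Because named tuples are immutable, each scenario is a distinct value and cannot be modified by accident after it has been created.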

Also note that, depending on which version of Python you are using, there may be a limit on the number of arguments you can pass in a function call (CPython before 3.7 rejected calls with more than 255 explicit arguments).

Or you could use just a dictionary:

def some_func(a_dictionary):
    # .get() returns None by default if 'argXYZ' doesn't exist
    arg_xyz = a_dictionary.get('argXYZ')
    return arg_xyz
jmunsch