
I'm working on a program that performs a Monte Carlo simulation of percolating systems (using Python). In order to run it from a GUI (tkinter) and to use multiple processes, I've defined the main part of the simulation in a main() function. The thing is, since this program is a physical simulation, it takes in many parameters (10+). Some functions called from main() also need a lot of parameters and are called many, many times. For instance, in my main(), I have a generate_wire() function that takes 8 parameters, such as wires_mean_length, wires_distribution, etc. This one is called millions of times.

Can that affect the efficiency of the program? Is it something that should be fixed, and if so, how?

EDIT: The code is basically structured as follows:

def generate_wire("8 parameters"):
    "generate a wire according to the parameters"

def main("main parameters"):
    for _ in range(nbr_sim):
        while True:
            generate_wire("8 parameters taken from the main parameters")
            "various calculations"
            if percolation:
                break

if __name__ == '__main__':
    "GUI code"
    "Run button calls main() with parameters from the GUI entries"

Ethalides
  • Please show some code or research; and yes, it probably affects performance. – Matiiss Oct 29 '20 at 16:05
  • Done! Thank you for your answer. – Ethalides Oct 29 '20 at 16:17
  • Now I'm starting to see that I don't really know about efficiency. Still, multiple parameters are probably not a problem, except maybe for organization, and if the calculation isn't shown graphically (just the result, perhaps), it shouldn't be too inefficient. For huge loops it will take some time, though; I can't tell precisely how much, but that can be measured. – Matiiss Oct 29 '20 at 16:26
  • Indeed, the calculation is not shown and only the results are stored. Measuring it is a good idea; I'll try it out. – Ethalides Oct 29 '20 at 16:31
  • I've run a foo function 100,000 times: once using globals, once passing 15 arguments to the function. I didn't get any significant time difference between the two approaches. Of course, that doesn't prove anything, but I think both methods are reasonably efficient. – Ethalides Oct 29 '20 at 19:10
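
A micro-benchmark along those lines can be written with timeit. This is a minimal sketch; foo_args, foo_globals, and their five stand-in parameters are hypothetical, not the code from the comment:

import timeit

# Hypothetical stand-ins for the real simulation parameters
A, B, C, D, E = 1.0, 2.0, 3.0, 4.0, 5.0

def foo_args(a, b, c, d, e):
    # Trivial body: the goal is only to measure call overhead
    return a + b + c + d + e

def foo_globals():
    # Same work, but reading module-level globals instead of arguments
    return A + B + C + D + E

print(timeit.timeit(lambda: foo_args(A, B, C, D, E), number=100_000))
print(timeit.timeit(foo_globals, number=100_000))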

1 Answer


Practically, this will not affect the runtime of the program compared to other design choices.

You can bring all of your arguments together into a dictionary or a custom class and pass that around to make the logic clearer.
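
For example, a dataclass can bundle the parameters into one object. This is a sketch; the field names are guesses based on the two parameters named in the question:

from dataclasses import dataclass

@dataclass
class SimulationParams:
    # Illustrative fields; the real simulation has ten or more
    wires_mean_length: float
    wires_distribution: str
    nbr_sim: int

def generate_wire(params: SimulationParams):
    # Read whatever wire generation needs from the single object
    length = params.wires_mean_length
    ...

params = SimulationParams(wires_mean_length=10.0,
                          wires_distribution="normal",
                          nbr_sim=1_000_000)
generate_wire(params)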

You could hoist the logic of your function directly into the loop, which will make the lookups occur less often.

More about local variable lookups

def generate_wire_wrapper("8 parameters"):
    for _ in range(nbr_sim):
        "logic to generate a wire"
        "various calculations"

def main("main parameters"):
    generate_wire_wrapper("8 parameters taken from the main parameters")
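
To make the lookup savings concrete (an illustrative sketch, not code from the question): in CPython, reading a local variable is cheaper than a global or attribute lookup, so binding a frequently called function to a local name before a hot loop avoids one lookup per iteration:

import random

def simulate(n):
    gauss = random.gauss  # hoist the attribute lookup out of the loop
    total = 0.0
    for _ in range(n):
        total += gauss(0.0, 1.0)  # each call now only needs a local lookup
    return total

print(simulate(100_000))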

However, improving the design will be your real ally here.

Instead of calling the same function thousands of times in a loop, consider:

  • using some sort of a pool of workers to do the processing in parallel (see the sketch after this list)
  • taking advantage of (or writing, if you have to) logic in C to make the operation more efficient. Science libraries' mapping methods do this for efficiency, part of which is also dodging the GIL by doing more work per step (the .apply method of Pandas DataFrames does this, for example)
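
Here is a minimal sketch of the worker-pool idea using the standard multiprocessing module; run_simulation and its seeding scheme are placeholders for the real simulation entry point:

from multiprocessing import Pool
import random

def run_simulation(seed):
    # Placeholder for one full, independent simulation run: each worker
    # would generate wires until percolation and return its result
    rng = random.Random(seed)  # per-run RNG so workers don't repeat each other
    return rng.random()        # stand-in for the real result

if __name__ == '__main__':
    with Pool() as pool:  # defaults to one worker per CPU core
        results = pool.map(run_simulation, range(20))  # 20 independent runs
    print(results)

Each run here is fully independent, which matches the later idea in the comments of parallelizing whole simulations rather than individual generate_wire calls.
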
ti7
  • Thanks! I thought about passing it in a dictionary, but I was wondering if it would really make a difference in terms of efficiency. – Ethalides Oct 29 '20 at 16:22
  • @Ethalides Probably not; it would be almost the same as creating a new file and importing data from it, and that isn't slow. Or maybe I don't know how Python works with imports. – Matiiss Oct 29 '20 at 16:33
  • Arguments, dictionaries, etc. are passed by reference, so they only incur the cost of looking up what the name refers to. You'll pay this cost more the more arguments you have and the deeper you have to reach to get them. However, the lookup cost is incredibly small, and it's more efficient to use function vars than not (curiously, having methods of imports be arguments can make them more efficient, as it dodges a lookup). – ti7 Oct 29 '20 at 16:38
  • Thank you for your thorough answer! The problem is that I cannot run this function in parallel because I need the previous results at each iteration (each time a wire is added, I'm checking if there is a percolation path). I like the wrapper solution, though! And I will definitely look into pandas; I should've done so earlier, actually. – Ethalides Oct 29 '20 at 17:41
  • @Ethalides Anytime! If you find each step depends on the previous one, there may be some way to calculate more at once and unroll your loop. If you can make a minimal example, it could be an exciting/valuable question! Pandas and its `df.apply` are just an example of the technique, so it may not be applicable at all, but if you have lots of data it could certainly be part of a good solution for you. – ti7 Oct 29 '20 at 18:36
  • Actually, I realised I forgot an important step of the code; I've edited it in. So what I mean to do is, instead of running generate_wire in parallel processes, to run multiple simulations at the same time. I was thinking about using the multiprocessing module to do this, but now I wonder if pandas couldn't be more efficient. I will definitely try to figure it out. – Ethalides Oct 29 '20 at 18:47
  • Splendid (though it definitely has a syntax error in the latest revision)! I added the python-internals tag to your post too, which may attract some more enlightened review. – ti7 Oct 29 '20 at 18:49