
Background
I have a collection of Python scripts used to build and execute Verilog-AMS testbenches. The overall design was built with threading in mind: each major test case is its own testbench, and all of the supporting files / data output are kept separate for each instance. The only shared items will be the launcher script and my data extraction script. The problem I'm faced with is that my Verilog-AMS simulator does not natively support multithreading, and my test cases take a substantial amount of time to complete.

Problem
The machine I'm running this on has 32 GiB of RAM and 8 "cores" available for me to use, and I may be able to access a machine with 32. I would like to take advantage of the available computing power and execute the simulations simultaneously. What would be the best approach?

I currently use `subprocess.call` to execute each simulation (roughly as in the sketch below). I would like to execute up to n commands at once, with each one running on a separate thread / as a separate process. Once a simulation has completed, the next one in the queue (if one exists) would execute.
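Something along these lines is what I have now (a minimal sketch; `ams_sim`, `run.scs`, and the directory names are placeholders rather than my actual tool and files):

import subprocess

# Hypothetical per-testbench directories, each with its own supporting files
testbenches = ["tb_case1", "tb_case2", "tb_case3"]

for tb in testbenches:
    # Each simulation runs to completion before the next one starts
    subprocess.call(["ams_sim", "run.scs"], cwd=tb)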

I'm pretty new to Python and haven't really written a threaded application. I would like some advice on how I should proceed. I saw this question, and from that I think the multiprocessing module may be better suited to my needs.

What do you all recommend?


1 Answer


I have done similar tasks in the past for machine learning and data mining. Using multiprocessing in your case should not be that difficult. Depending on how fault-tolerant you want to make the program, you can use a thread pool pattern; my personal favourite is the producer-consumer pattern using a Queue, since that design can handle a variety of complex tasks. Here is a sample toy program using multiprocessing:

from multiprocessing import Queue, Process
from Queue import Empty as QueueEmpty  # Python 2; use "from queue import Empty" on Python 3

# Assuming this text is very, very large
text = "Here I am writing some nonsense\nBut people will read\n..."

def read(q):
    """Read the text and put each line into the input queue."""
    for line in text.split("\n"):
        q.put(line)

def work(qi, qo):
    """Take a line from the input queue and put the result on the output queue."""
    while True:
        try:
            data = qi.get(timeout=1)  # Time out after 1 second
            qo.put(data)
        except QueueEmpty:
            return  # Exit when all work is done
        except:
            raise  # Re-raise all other errors

def join(q):
    """Drain the output queue and write everything to a text file."""
    f = open("file.txt", "w")
    while True:
        try:
            f.write(q.get(timeout=1) + "\n")
        except QueueEmpty:
            f.close()
            return
        except:
            raise

def main():
    # Input queue
    qi = Queue()
    # Output queue
    qo = Queue()
    # Start the producer
    Process(target=read, args=(qi,)).start()
    # Start 8 consumers
    for i in range(8):
        Process(target=work, args=(qi, qo)).start()
    # Final process to handle the output queue
    Process(target=join, args=(qo,)).start()

if __name__ == "__main__":
    main()

I typed this from memory, so if there are any errors, please correct them. :)
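Since the question is about launching an external simulator command, a more direct way to apply the same idea is a process pool that wraps `subprocess.call`. Here is a minimal sketch of that approach (the `ams_sim` command and testbench directory names are placeholders, not a real Verilog-AMS invocation):

import subprocess
from multiprocessing import Pool

def run_sim(workdir):
    """Launch one simulation in its own directory and return its exit code."""
    # Placeholder command; substitute the real simulator invocation here
    return subprocess.call(["ams_sim", "run.scs"], cwd=workdir)

if __name__ == "__main__":
    testbenches = ["tb_case1", "tb_case2", "tb_case3", "tb_case4"]
    pool = Pool(processes=8)                 # run at most 8 simulations at once
    exit_codes = pool.map(run_sim, testbenches)
    pool.close()
    pool.join()
    print(exit_codes)                        # one exit code per testbench, in order

`Pool.map` blocks until every simulation has finished and hands out the remaining testbenches as workers free up, which matches the "start the next one in the queue when a slot opens" behaviour you described.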

  • Thanks, this helps. Do you know if this will play nicely when executing external commands using `subprocess.call`, such that each item in the queue would need to execute an external command, e.g. `program_name.exe -arguments`? – James Mnatzaganian Apr 12 '13 at 16:49
  • I have yet to try combining `subprocess` and `multiprocessing`; however, I do not think it would be a problem as long as you handle the communication in `subprocess` correctly. I have seen implementations using both in [this question](http://stackoverflow.com/questions/884650/how-to-spawn-parallel-child-processes-on-a-multi-processor-system). It seems like what you are looking for. – nqngo Apr 13 '13 at 06:27
  • Thanks, that helps a lot! I think between your post and that one, I should be all set :) – James Mnatzaganian Apr 13 '13 at 14:57