
I implemented `multiprocessing.Pool.map` for a long list of independent, very expensive operations, as described in my previous question Distribute many independent, expensive operations over multiple cores in python.

Even with multiple cores, the work can take several hours, so I would like to provide a simple visual cue of its progress. As an experiment, I tried printing the ID of each item from within the mapped function, but 1) in an IDE the output does not appear until all operations have completed (less problematic), and 2) the operations do not complete in order (more problematic).
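For reference, a minimal sketch of the setup described above (`process_item` and the item range stand in for my actual work):

from multiprocessing import Pool

def process_item(item):
    # in practice this is a long, expensive operation
    print('item', item)   # appears late in an IDE, and out of order
    return item

if __name__ == '__main__':
    with Pool() as pool:
        results = pool.map(process_item, range(100))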

What's the best way of going about this?

  • If you need to influence and interact with the tasks, either switch to the `threading` library and structure each work task with an "interface" the backend can use to report progress on the task at hand, or wait for someone more experienced with the pool object to answer. – Torxed Feb 06 '13 at 18:30

1 Answer

from threading import Thread
from time import sleep

class Worker(Thread):
    def __init__(self, params=None):
        Thread.__init__(self)

        self.params = params
        self.status = 0.0   # progress, from 0.0 to 1.0
        self.start()

    def run(self):
        while self.status < 1.0:
            # <--- This is where you would execute your
            #      demanding/costly operations.
            # Update your status (progress) as you go.
            self.status += 0.1
            sleep(0.1)

x = Worker()
y = Worker()

while x.status < 1.0 or y.status < 1.0:
    print('X status:', x.status)
    print('Y status:', y.status)
    sleep(0.1)

Note: The 1.0 counter limit is only there for the demo. In a real application you would either keep the thread alive forever with an endless loop, or have the run() function do your calculations and then die once it has stored the desired value on the instance, in the same manner as the self.status variable.
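For instance, a minimal sketch of that second pattern, where run() performs one calculation, stores the result on the instance, and then dies (the calculation itself is just a stand-in):

from threading import Thread

class ResultWorker(Thread):
    def __init__(self, params=None):
        Thread.__init__(self)
        self.params = params
        self.result = None   # filled in once run() finishes
        self.start()

    def run(self):
        # <--- your real calculation would go here
        self.result = sum(range(1000000))

w = ResultWorker()
w.join()                     # wait for the thread to finish (die)
print('Result:', w.result)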

Torxed
  • Also note that this is a manual way of executing tasks, rather than using the multiprocessing pool object, which is there to do "quick and dirty" distributions, from what I've learned. With this approach you have full control over what is executed and reported back. However, threads are limited by Python's global interpreter lock, which keeps Python code effectively **single-threaded**, and there's no way around that. – Torxed Feb 06 '13 at 18:44
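For completeness, here is a minimal sketch of how progress could still be reported with the pool approach from the question, by swapping `pool.map` for `pool.imap_unordered` and counting results as they arrive (`expensive_operation` is a placeholder for the real work):

from multiprocessing import Pool

def expensive_operation(item):
    # placeholder for the real, costly work
    return item * item

if __name__ == '__main__':
    items = list(range(100))
    completed = 0
    with Pool() as pool:
        # imap_unordered yields each result as soon as it is ready,
        # so progress can be printed even though completion order varies
        for _ in pool.imap_unordered(expensive_operation, items):
            completed += 1
            print('completed %d of %d' % (completed, len(items)))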