
I have a simple project where I need to print in parallel progress info, such as a progress bar.

Each bar has a position and the writing cursor in the terminal is moved up and down depending on the bar's position.

This works well when done serially, but it fails when printing in parallel because of race conditions. I tried using a multiprocessing.Lock(), but to no avail.

Here is my current code:

from __future__ import division

import os, sys
import signal
from time import sleep
from multiprocessing import Pool, freeze_support, Lock

if os.name == 'nt':
    import colorama  # to support cursor up
    colorama.init()

_term_move_up = '\x1b[A'

write_lock = Lock()

class simple_bar(object):
    def __init__(self, iterable, desc='', position=0):
        signal.signal(signal.SIGINT, signal.SIG_IGN)  # ignore KeyboardInterrupt (Ctrl+C) in workers
        self.iterable = iterable
        self.total = len(iterable)
        self.n = 0
        self.position = position
        self.desc = desc
        self.display()

    def __iter__(self):
        for obj in self.iterable:
            yield obj
            self.update()

    def update(self, n=1):
        self.n += n
        self.display()

    def display(self, fp=None, width=79):
        if not fp:
            fp = sys.stdout

        with write_lock:
            fp.write('\n' * self.position)
            l_part = self.desc + ': '
            bar = l_part + '#' * int((self.n / self.total) * (width - len(l_part)))
            fp.write('\r' + bar + ' ' * (width - len(bar)))
            fp.write(_term_move_up * self.position)
            fp.flush()

def progresser(n):
    text = "progresser #{}".format(n)
    for i in simple_bar(range(5000), desc=text, position=n):
        sleep(0.001)

if __name__ == '__main__':
    freeze_support()
    L = list(range(3))
    Pool(len(L)).map(progresser, L)

Here is a serial alternative that works fine; it produces the correct output that the parallel version above should give:

# Same code as above, except __main__

if __name__ == '__main__':
    t_list = [simple_bar(range(5000), desc="progresser #{}".format(n), position=n) for n in xrange(3)]
    for i in range(5000):
        for t in t_list:
            t.update()

I have no idea what is going wrong. I am using Python 2.7.12 on Windows 7.

I am looking for a way to print safely from multiple processes with multiprocessing, and ideally (but optionally) in a thread-safe way as well.

EDIT: interestingly, if I add a sleep (long enough) just before printing, then the bars are printed correctly:

# ...
    def display(self, fp=None, width=79):
        if not fp:
            fp = sys.stdout

        with write_lock:
            sleep(1)  # this fixes the issue by adding a delay
            fp.write('\n' * self.position)
            l_part = self.desc + ': '
            bar = l_part + '#' * int((self.n / self.total) * (width - len(l_part)))
            fp.write('\r' + bar + ' ' * (width - len(bar)))
            fp.write(_term_move_up * self.position)
            fp.flush()
# ...

I don't know what conclusion to draw from this.

gaborous
  • Not sure I understand correctly. Do you want to concurrently process some job and print a progress bar when a part of it is done? Is it of any importance whether the progress is printed by your subprocesses or by your main process? – noxdafox Oct 30 '16 at 14:32
  • @noxdafox Yes to the first question, for the second yes the progress should be printed from the subprocess, this is the issue. From the main process there is no issue as there is no concurrency involved. – gaborous Oct 30 '16 at 14:47

2 Answers


You need to add fp.flush() before write_lock.release().

Unrelated comments:

  • Consider using the lock as a context manager (with write_lock... instead of manual acquire() and release()) — that is easier to follow and less error-prone.
  • Neither version handles interruptions (Ctrl+C) well; you may want to look into that.
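As a minimal illustration of the first bullet (using a module-level write_lock like the question's code), the two styles look like this:

```python
from multiprocessing import Lock

write_lock = Lock()

# Manual style: release() is skipped if the body raises,
# unless you remember to wrap it in try/finally.
write_lock.acquire()
try:
    print("critical section")
finally:
    write_lock.release()

# Context-manager style: acquires on entry, releases on exit,
# including when the body raises an exception.
with write_lock:
    print("critical section")
```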
Vasiliy Faronov
  • Thank you for your suggestions, but sorry this does not fix the issue, see the updated code above. In my original code, fp.flush() was called, I forgot to add it in this compactified version, sorry about that, but anyway it does not help. – gaborous Oct 30 '16 at 05:47

This is likely a problem with the global lock variable. On Unix, child processes are created with fork() and inherit a copy of the parent's memory, so they all share the same underlying lock. On Windows, processes are spawned instead: each child re-imports the module and therefore creates its own, unrelated Lock(), which protects nothing.

Try this code instead:

from __future__ import division
import os, sys
import signal
from time import sleep
from multiprocessing import Pool, freeze_support, Lock

if os.name == 'nt':
    import colorama  # to support cursor up
    colorama.init()

_term_move_up = '\x1b[A'

class simple_bar(object):
    def __init__(self, iterable, desc='', position=0):
        signal.signal(signal.SIGINT, signal.SIG_IGN)  # ignore KeyboardInterrupt (Ctrl+C) in workers
        self.iterable = iterable
        self.total = len(iterable)
        self.n = 0
        self.position = position
        self.desc = desc
        self.display()

    def __iter__(self):
        for obj in self.iterable:
            yield obj
            self.update()

    def update(self, n=1):
        self.n += n
        self.display()

    def display(self, fp=None, width=79):
        if not fp:
            fp = sys.stdout

        with write_lock:
            fp.write('\n' * self.position)
            l_part = self.desc + ': '
            bar = l_part + '#' * int((self.n / self.total) * (width - len(l_part)))
            fp.write('\r' + bar + ' ' * (width - len(bar)))
            fp.write(_term_move_up * self.position)
            fp.flush()

def progresser(n):
    text = "progresser #{}".format(n)
    for i in simple_bar(range(5000), desc=text, position=n):
        sleep(0.001)

def init_child(lock_):
    # Runs once in each worker process; stores the lock inherited
    # from the parent as a module-level global so display() can use it.
    global write_lock
    write_lock = lock_

if __name__ == '__main__':
    write_lock = Lock()
    L = list(range(3))
    pool = Pool(len(L), initializer=init_child, initargs=(write_lock,))
    pool.map(progresser, L)
Alexey Smirnov
  • Good catch, it works. Damn Windows... However ideally I would like to manage the lock transparently for the parent, without requiring the parent to provide a lock (the lock should be created by the children, maybe by the class or I don't know what). Do you think this is possible? – gaborous Oct 31 '16 at 11:18
  • There is no way around it: as Alexey said, Windows cannot fork processes, only spawn them, so the child process has no access to the parent's data. We need to pass the lock from the parent to the children. See also: http://stackoverflow.com/a/28721419 and http://rhodesmill.org/brandon/2010/python-multiprocessing-linux-windows/ – gaborous Dec 26 '16 at 12:26
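For what it's worth, the parent-side bookkeeping can be hidden behind a small factory so callers never touch the lock themselves. This is only a sketch with made-up names (make_pool, _init_worker are not from the answer):

```python
from multiprocessing import Pool, Lock

def _init_worker(lock_):
    # Runs once inside each worker process; stashes the inherited
    # lock as a module-level global for the worker code to use.
    global write_lock
    write_lock = lock_

def make_pool(n_workers):
    # The lock is created here and never exposed to the caller.
    # Passing it through initargs is what makes it the *same* lock
    # in every child, on Windows (spawn) as well as on Unix (fork).
    lock = Lock()
    return Pool(n_workers, initializer=_init_worker, initargs=(lock,))
```

On Windows, make_pool() must still be called from under if __name__ == '__main__':, because each spawned child re-imports the module.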