
I have a Vue.js front end, a Python Flask back end, and I'm using Google Firebase as the database.

Some tasks show a progress bar to the user. To drive the bar, the back end writes progress values to the database and the front end reads them.

The back end was too slow, so I parallelized some operations using multiprocessing.Pool. Performance improved considerably, but now the progress bar sometimes advances and then jumps back, because several processes are writing to the database.

Here I create the pool:

import glob
import multiprocessing
import os

num_workers = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes=num_workers)
# Submit one task per file; i is the 1-based submission index of the task
for i, filename in enumerate(glob.glob(os.path.join(files_path, '*')), start=1):
    pool.apply_async(create_single_model, (filename, models_path, num_model_photos, uid, i))
    print('Process: ' + str(i))

Then, in create_single_model, I receive i and increment it, but I don't think that is correct:

.
some operations...
.

percentage = round((i / num_model_photos) * 100)
i = i + 1
ref = fb.db.collection('users').document(uid).get()
actual_percentage = ref.to_dict()['running']
if percentage != actual_percentage:
    fb.db.collection('users').document(uid).set({
        'running': percentage
    }, merge=True)

1 Answer

It looks like each subprocess is calculating its own progress rather than the overall progress. You need a shared variable that every subprocess can access to hold the overall progress. Each subprocess then increments that variable by 1 when it finishes a job.
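
To see why a per-task index makes the bar jump back, here is a small illustration with no multiprocessing involved. It assumes four tasks that finish in a made-up order; the function names index_based and count_based are mine, not from the question.

```python
def index_based(completion_order, total):
    """Percentages reported when each task uses its own submission index."""
    return [round(i / total * 100) for i in completion_order]


def count_based(completion_order, total):
    """Percentages reported when each finished task bumps one shared count."""
    return [round(done / total * 100)
            for done, _ in enumerate(completion_order, start=1)]


order = [2, 1, 4, 3]  # hypothetical order in which tasks 1..4 finish
print(index_based(order, 4))  # [50, 25, 100, 75]
print(count_based(order, 4))  # [25, 50, 75, 100]
```

The index-based values depend on which task happens to finish first, so they can decrease. The count-based values only ever increase.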

Following this answer:
Sharing a counter with multiprocessing.Pool

I managed to come up with a simple example:

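import time
from multiprocessing import Pool, Value

TOTAL_STEPS = 100  # 2 tasks x 50 steps each


def some_func(n):
    for _ in range(n):
        time.sleep(0.1)  # simulate one unit of work, outside the lock
        # Update and print under the lock so values appear in increasing order
        with cnt.get_lock():
            cnt.value += 1
            print('\r{:.0%}'.format(cnt.value / TOTAL_STEPS), end='', flush=True)


def init_globals(counter):
    # Runs once in each worker process: expose the shared counter as a global
    global cnt
    cnt = counter


if __name__ == "__main__":
    counter = Value('i', 0)  # shared integer with a built-in lock

    with Pool(initializer=init_globals, initargs=(counter,), processes=2) as pool:
        pool.starmap(some_func, [(50,), (50,)])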
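
Applied to the setup in the question, the same idea might look like the sketch below. It is only an outline under some assumptions: report_progress is a hypothetical stand-in for the Firestore write, record_completion and init_worker are names I made up, and the file names and uid are placeholders. The report is made while the lock is held, so two workers cannot write their percentages in the wrong order.

```python
from multiprocessing import Pool, Value

_counter = None  # set in every worker process by init_worker


def init_worker(counter):
    """Pool initializer: keep the shared counter in a per-process global."""
    global _counter
    _counter = counter


def record_completion(counter, total, report):
    """Count one finished job and report the overall percentage.

    `report` is called while the lock is held, so percentages are
    reported in increasing order even with several workers.
    """
    with counter.get_lock():
        counter.value += 1
        percentage = round(counter.value / total * 100)
        report(percentage)
    return percentage


def report_progress(uid, percentage):
    """Hypothetical stand-in for the database write in the question."""
    print(f"{uid}: {percentage}%", flush=True)


def create_single_model(filename, uid, total):
    # ... build the model for `filename` here ...
    record_completion(_counter, total, lambda p: report_progress(uid, p))


if __name__ == "__main__":
    filenames = [f"photo_{n}.jpg" for n in range(8)]  # placeholder inputs
    counter = Value("i", 0)
    with Pool(processes=2, initializer=init_worker, initargs=(counter,)) as pool:
        pending = [
            pool.apply_async(create_single_model, (name, "some-uid", len(filenames)))
            for name in filenames
        ]
        for result in pending:
            result.get()  # wait for the task and re-raise any worker exception
```

Holding the lock during a slow database write makes the workers wait on each other for that moment. If that becomes a bottleneck, an alternative is to report only every few completions.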