
In my program I need to spawn a multiprocessing pool with 16 worker processes to utilise the 16 cores on my 3950X. I have an initialiser which sets three global variables in the spawned child processes (totalling around 300 kB).

Before using the initialiser, each process took about 1s to spawn. Now the 16 processes take around 100s in total! Any idea why this is now so slow? Sample code below:

import multiprocessing as mp

def set_mp_global_vars(bc, bm, bf_n):
    """mp initialiser which sets global vars to reduce mp overhead"""
    global base_comps, base_matches, bf_names
    base_comps = bc
    base_matches = bm
    bf_names = bf_n

int_pool_workers = mp.cpu_count()
pool = mp.Pool(processes=int_pool_workers, initializer=set_mp_global_vars,
               initargs=(base_comps, base_matches, bf_names))
martineau
Kelvin Cheung
  • Passing data between processes involves pickling on one side and un-pickling on the other, as well as some interprocess I/O, which can be a lot of overhead — although 100x does indeed seem excessive… – martineau Jan 10 '21 at 00:58
  • But that is exactly what I'm trying to avoid: by setting global vars I don't need to pickle and pass the data to the processes each time they need it. Say I have 20,000 tasks; instead of pickling the data 20,000 times I only have to initialise it 16 times, once per pool worker. – Kelvin Cheung Jan 10 '21 at 03:22
  • Update: it appears it is the variable "bm" which is causing the slowness. Removing "bm" as a parameter and base_matches as a global var results in the 16 processes spawning in about 16s. "bm" is a nested defaultdict of ~8000 custom class instances. getsizeof says it is only about 300 kB, but I'm not sure whether that is the reference object only rather than the true size. – Kelvin Cheung Jan 10 '21 at 12:00
  • It's been a while, but I recall reading that `getsizeof()` values are unreliable. – martineau Jan 10 '21 at 12:14
  • Update2: I've resolved the issue by saving the vars and then loading them from disk (instead of passing them into set_mp_global_vars). I still don't really understand why it takes so long to spawn a pool process. "bm" took about 34mb on disk, so it was much larger than getsizeof suggested. – Kelvin Cheung Jan 10 '21 at 12:20
  • 1
    `getsizeof()` returns the size of *that object*. If that object is a container with references to other objects, it only counts the memory holding those top-level references, not the size of the objects they refer to, recursively. For example `L = [os.urandom(1<<30)]` creates a list holding a reference to a 1GB buffer, but `sys.getsizeof(L) == 64`. – Mark Tolonen Jan 11 '21 at 02:34
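Mark Tolonen's point can be demonstrated directly. A sketch using a plain `bytes` buffer (a stand-in for the question's nested defaultdict): the pickled length is a much better proxy for what actually crosses a process boundary than `sys.getsizeof()`.

```python
import pickle
import sys

# A list holding one reference to a 1 MB buffer: getsizeof() counts only the
# list header plus one pointer, not the buffer that pointer refers to.
buf = bytes(1 << 20)               # 1 MB payload
L = [buf]

shallow = sys.getsizeof(L)         # size of the list object itself: tens of bytes
deep = len(pickle.dumps(L))        # serialized size: roughly what a Pool ships to workers

print(shallow, deep)               # shallow is tiny; deep is just over 1 MB
```

This is consistent with the question's numbers: `getsizeof` reported ~300 kB while the pickle on disk was ~34 MB.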

1 Answer


Using Python 3.9.2 on Windows 10, spawning 30 processes on a Threadripper CPU takes 23s with PyCharm debugging on. If I run without debugging, it takes 0.21s.

The pydev debugger slows down spawning an mp.Pool by a factor of roughly 100x.

You may also look at: python-3-6-multiprocessing-pool-very-slow-to-start-using-windows

AndyP1970