I haven't found a good way to monitor the memory usage of a Python script that uses multiprocessing. More specifically, say I do this (Python 2, where range builds a full in-memory list):
import time
biglist = range(pow(10, 7))
time.sleep(5)
The memory usage is 1.3 GB, as measured by both /usr/bin/time -v and top. But now, say I do this:
import time
from multiprocessing import Pool
def worker(x):
    biglist = range(pow(10, 7))
    time.sleep(5)
    return
Pool(5).map(worker, range(5))
Now top reports 5 x 1.3 GB, which is correct. But /usr/bin/time -v still reports 1.3 GB, which doesn't make sense: if it were measuring only the parent process, it should report close to 0, and if it were measuring the parent plus the children, it should report 5 x 1.3 GB. Why does it say 1.3 GB? Now let's try copy-on-write:
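From reading the getrusage(2) man page, my working theory is that /usr/bin/time takes its numbers from the rusage returned by wait4(), and on Linux the ru_maxrss it sees for children is the peak of the largest descendant, not a sum over the process tree, which would explain the 1.3 GB. A minimal sketch to check this, using only the stdlib resource module (Linux-only, where ru_maxrss is in KiB; the ~100 MiB allocation is an arbitrary illustrative size):

```python
import resource
from multiprocessing import get_context

def child():
    # Touch ~100 MiB so the child's peak RSS clearly exceeds the parent's.
    buf = bytearray(100 * 1024 * 1024)
    del buf

# "fork" is the Linux default start method and keeps this snippet guard-free.
ctx = get_context("fork")
p = ctx.Process(target=child)
p.start()
p.join()

self_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
kids_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
# On Linux, ru_maxrss is in KiB. RUSAGE_CHILDREN reflects the largest
# waited-for child (maxima propagate up the tree), not a sum, so running
# five such children would still report roughly one child's peak.
print(self_kib, kids_kib)
```

If that theory is right, it would explain both observations: when the workers allocate, time sees the peak of one worker (1.3 GB), and when the parent allocates, it sees the parent's peak (1.3 GB again).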
import time
from multiprocessing import Pool
biglist = range(pow(10, 7))
def worker(x):
    time.sleep(5)
    return
Pool(5).map(worker, range(5))
Now /usr/bin/time -v again reports 1.3 GB, which is correct. But top reports 6 x 1.3 GB, which is incorrect: with copy-on-write, the pages are shared, so the total should still be about 1.3 GB.
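I think I can see why top behaves this way: on Linux, each process's RSS counts shared copy-on-write pages in full, so after fork() every worker is charged for the parent's pages even though there is only one physical copy. A quick sketch that seems to confirm this (Linux-only; it reads VmRSS from /proc, and the ~100 MiB size is again arbitrary):

```python
import os
from multiprocessing import get_context

def rss_kib(pid):
    # VmRSS from /proc/<pid>/status, in KiB (Linux). It counts shared
    # copy-on-write pages in full, which is what top's RES column shows.
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return 0

big = bytearray(100 * 1024 * 1024)  # ~100 MiB allocated in the parent only

def worker(q):
    # The child never touches `big`, yet its RSS includes those shared pages.
    q.put(rss_kib(os.getpid()))

ctx = get_context("fork")
q = ctx.Queue()
p = ctx.Process(target=worker, args=(q,))
p.start()
child_rss = q.get()
p.join()
print(rss_kib(os.getpid()), child_rss)
```

So with one parent and five forked workers, top adding up six RES values of ~1.3 GB each is consistent, even though only ~1.3 GB of physical memory is in use.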
How can I reliably monitor the memory usage of a Python script that uses multiprocessing?
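For what it's worth, the most promising direction I've found so far is PSS (proportional set size), which splits each shared page evenly among the processes mapping it, so summing PSS over the parent and the pool workers neither drops the children (like /usr/bin/time seems to) nor double-counts copy-on-write pages (like top). This sketch is Linux-only; pss_kib is my own helper, not a standard API, and /proc/<pid>/smaps_rollup needs kernel 4.14+:

```python
import os

def pss_kib(pid):
    # The rolled-up Pss field from /proc/<pid>/smaps_rollup, in KiB.
    # Each shared page is divided by the number of processes mapping it.
    with open("/proc/%d/smaps_rollup" % pid) as f:
        for line in f:
            if line.startswith("Pss:"):
                return int(line.split()[1])
    return 0

# While a Pool is alive, the total would be the parent plus its workers,
# e.g. via multiprocessing.active_children():
#   total = pss_kib(os.getpid()) + sum(
#       pss_kib(p.pid) for p in multiprocessing.active_children())
print(pss_kib(os.getpid()))
```

Is this the right approach, or is there a more standard tool for this?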