I would like a way to trace how subprocesses in Python use memory. More precisely, I would like to get information about the variables used by each subprocess: whether they are copies of the main process's variables, whether they are shared among multiple subprocesses, etc.

Note that I am not interested in general techniques for debugging subprocesses in Python; I am aware of the numerous posts on that topic. I strictly want more information about memory usage.

John Jaques

2 Answers

I think this answer can help you out: Python Memory Profiler.

Example:

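# On Python 3 the package is published as guppy3; the import name is unchanged.
from guppy import hpy
h = hpy()
# Print a summary of all live objects on the heap, grouped by type.
print(h.heap())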

Results:

Partition of a set of 132527 objects. Total size = 8301532 bytes.
Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
0  35144  27  2140412  26   2140412  26 str
1  38397  29  1309020  16   3449432  42 tuple
2    530   0   739856   9   4189288  50 dict (no owner)

There is also a module that is easy to use and shows, line by line, how each operation and variable affects memory usage:

memory_profiler

Its output looks something like this (extract from the package's page):

Line #    Mem usage  Increment   Line Contents
==============================================
     3                           @profile
     4      5.97 MB    0.00 MB   def my_func():
     5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB -152.59 MB       del b
     8     13.61 MB    0.00 MB       return a
levi

In general, a subprocess doesn't share variables with its parent. It might share memory pages, though.

On POSIX systems Python launches a subprocess by using os.fork() followed by os.execvp() or os.execvpe(). (On Windows it uses CreateProcess(), but I'll focus on POSIX.)

After the fork, the child process is a carbon copy of the parent. But unlike threads, they do not share memory (or variables) by design; they are separate processes. A forked process does have a copy of its parent process's descriptors, pointing to the same files. Depending on the OS's implementation of virtual memory management they might share memory pages using copy on write ("COW").
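To make the "carbon copy, but not shared" point concrete, here is a minimal sketch (POSIX only). The names `counter` and `fork_and_modify` are made up for this illustration. The child changes its own copy of the variable and reports the value back through a pipe, while the parent's copy stays untouched.

```python
import os

counter = 0  # hypothetical module-level variable living in the parent


def fork_and_modify():
    """Fork, change `counter` in the child only, return (parent value, child value)."""
    global counter
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it works on its own copy of `counter`.
        os.close(read_fd)
        counter += 100
        os.write(write_fd, str(counter).encode())
        os._exit(0)  # leave immediately, without running the parent's code
    # Parent process: read what the child saw, then reap the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter, child_value


if __name__ == "__main__":
    parent_value, child_value = fork_and_modify()
    print("parent sees", parent_value, "- child saw", child_value)
```

The parent still sees 0 while the child saw 100, even though both started from the same memory image.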

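However, shortly after that the child process calls os.execvp to replace itself with another program. The exec call discards the copied address space, so any pages the new program still shares with other processes (for example, the code of shared libraries that both have mapped) are shared by the OS's virtual memory implementation, not because of anything in your Python code. In particular, it cannot use variables from the process it replaces.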
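A small sketch of that last point: the child started by subprocess is a fresh interpreter and knows nothing about the parent's names. The variable `secret` and the helper `child_sees_variable` are hypothetical names used only for this illustration. It assumes Python 3.7 or later because of `capture_output`.

```python
import subprocess
import sys

secret = "only in the parent"  # hypothetical variable defined in the parent


def child_sees_variable():
    """Ask a freshly exec'ed Python child whether it has a global named `secret`."""
    code = "print('secret' in globals())"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("child has the variable:", child_sees_variable())
```

The child reports False. Anything it needs from the parent has to be passed explicitly, for example through arguments, environment variables, pipes, or files.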

If you want the child process to be a Python function, you should look into multiprocessing. There you can share variables between processes.
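As a rough sketch of the difference between copied and shared variables with multiprocessing: the names `worker` and `run_demo` are made up, and the "fork" start method is requested explicitly, which is an assumption that limits this example to POSIX systems. The `Value` object lives in shared memory, while the ordinary list is just copied into each child.

```python
import multiprocessing as mp


def worker(shared, plain):
    # `shared` is backed by shared memory, so this update is visible to the parent.
    with shared.get_lock():
        shared.value += 1
    # `plain` is an ordinary list: only this child's private copy changes.
    plain.append(1)


def run_demo(n_procs=4):
    """Start n_procs children and return (shared counter, length of the parent's list)."""
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork available
    shared = ctx.Value("i", 0)    # a C int in shared memory, guarded by a lock
    plain = []
    procs = [ctx.Process(target=worker, args=(shared, plain)) for _ in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return shared.value, len(plain)


if __name__ == "__main__":
    print(run_demo())
```

With four children the shared counter ends up at 4, while the parent's list is still empty because each child appended to its own copy.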

Roland Smith