
Linux has copy-on-write, which means that, after a fork, a child process shares memory with the parent process for as long as neither of them modifies it.

Let's say the parent process takes 10 GB of physical RAM. When I fork the process, the physical memory used by the OS doesn't immediately go up by 10 GB (it may go up slightly due to the creation of some administrative structures). This can be confirmed with the free shell command; thus free correctly accounts for CoW.

However, when I ask the OS about the amount of memory used by a specific process (e.g., using top or any C API function I am aware of), it shows the child process using 10 GB of physical memory right away (before it modifies anything). Thus per-process memory tracking doesn't correctly account for CoW.
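Here is a minimal sketch of what I'm observing (Python 3 on Linux, reading /proc directly; the 512 MB buffer is just an illustrative stand-in for the real 10 GB structures):

```python
import os
import time

def get_rss_kb(pid):
    """Resident set size in kB, as reported by the VmRSS line of /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])

# Parent touches ~512 MB of physical RAM before forking.
data = b"\x00" * (512 * 1024 * 1024)

pid = os.fork()
if pid == 0:
    # Child: reads nothing, writes nothing, just sits on the shared pages.
    time.sleep(2)
    os._exit(0)

time.sleep(0.5)
print("parent RSS (kB):", get_rss_kb(os.getpid()))
print("child  RSS (kB):", get_rss_kb(pid))  # typically reported near the parent's figure
os.waitpid(pid, 0)
```

Meanwhile, free shows the system-wide used memory barely moving after the fork.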

I am looking for a way to measure per-process memory that accounts for CoW. (I'm going to use it from Python, but once I know the relevant C API, I'm fine.)

To clarify: the shared memory used by multiple processes should be allocated, for accounting purposes, to the parent process.

USE CASE:

We're trying to reduce the total memory used by an application. We have very large data structures in the parent process, which are shared with the child processes by simply forking. We don't need to modify those structures in the child processes, but modifications to reference counters (in Python) cause parts of the memory to be copied. We're trying to minimize the extent to which this happens in order to preserve physical memory.

RELATED QUESTIONS

https://serverfault.com/questions/676335/how-measure-memory-without-copy-on-write-pages (provides a possible answer)

How to know whether a copy-on-write page is an actual copy? (provides some useful details to create a solution)
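If the direction in those linked answers pans out, the per-mapping counters in /proc/<pid>/smaps (Rss, Pss, Shared_Clean, Shared_Dirty, Private_Clean, Private_Dirty) look like the closest thing the kernel exports without root. Pss splits each shared page evenly across all processes mapping it rather than charging the parent, but the Shared_* vs Private_Dirty split lets you approximate the "charge it to the parent" definition above. A minimal sketch (values are in kB; field availability varies slightly by kernel version):

```python
def smaps_summary(pid):
    """Per-process totals (kB) of the CoW-relevant counters in /proc/<pid>/smaps."""
    totals = {"Rss": 0, "Pss": 0, "Shared_Clean": 0, "Shared_Dirty": 0,
              "Private_Clean": 0, "Private_Dirty": 0}
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            key = line.split(":", 1)[0]
            if key in totals:
                totals[key] += int(line.split()[1])  # all values are reported in kB
    return totals

# A child that has written nothing should show most of its Rss under Shared_Clean,
# while Private_Dirty approximates what it has actually copied.
# print(smaps_summary(child_pid))
```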

max
    Both processes' virtual pages map to the same physical pages; there's no unequivocal answer to "who owns that physical memory"... – Oliver Charlesworth Apr 11 '15 at 17:01
  • A natural definition of which process owns the memory is that the parent process owns the shared memory, since it was already in use before the child process started. This would allocate (nearly) zero memory to the child process until it actually starts modifying it. There may be cases where this definition isn't useful, but in the most obvious case, it can help analyze and improve memory usage. – max Apr 12 '15 at 21:55
  • That's a nice idea, but in practice there's nothing that tracks this. – Oliver Charlesworth Apr 12 '15 at 22:03
  • I don't think worrying about who "owns" the memory is productive - focus instead on figuring out which pages are mapped to multiple processes, and which are not. Is there any way you could set some pages to read-only before the fork for testing? That would presumably cause a protection fault if your code tries to write to them, thus letting you know about those cases at least to the extent your tests cover the pathways. Can you re-work things somehow using interprocess shared memory that is read only from one side, rather than copy-on-write behavior after a fork? – Chris Stratton Apr 13 '15 at 00:11
  • @ChrisStratton Unfortunately, since python maintains refcounts for objects, it potentially needs to modify any part of memory, even the memory that refers to the immutable objects. We're just trying to make it happen less often, but if we set pages to read-only, the program won't run. As to other IPC methods, we're looking into that, but nothing is really perfect. – max Apr 13 '15 at 00:32
  • It seems like what you would really want to do is group the data in one place, apart from its metadata (such as reference counts) in another, so that the former can remain in many unmodified pages and the latter be concentrated in a few modified ones. Perhaps changing the size of structures (fewer, bigger ones, with manual distinction within) could help. A real solution would be to modify Python to store the two categories with distinct allocators, but that would be a big project. – Chris Stratton Apr 13 '15 at 00:36
  • @ChrisStratton agreed... though I have no control over where Python stores its metadata; all we can do is create larger C-based structures so that refcount modifications don't force too much memory to be copied. (Only 4 KB pages get copied, so if one refcount handles a 1 GB C data structure, we're good. See the sketch after these comments.) – max Apr 13 '15 at 00:41
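A rough sketch of the "fewer, bigger objects" idea discussed in the comments (sizes and counts are illustrative, not a benchmark): touching a Python object from a forked child bumps its refcount, which dirties the 4 KB page holding the object header, so one big object costs a single copied page while a million small ones cost roughly a page per few dozen objects.

```python
import os
import sys
import time

def private_dirty_kb(pid="self"):
    """Sum of Private_Dirty over all mappings in /proc/<pid>/smaps, in kB."""
    total = 0
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            if line.startswith("Private_Dirty:"):
                total += int(line.split()[1])
    return total

big = b"\x00" * (256 * 1024 * 1024)            # one header for ~256 MB of payload
small = [b"x" * 50 for _ in range(1_000_000)]  # a million separate headers

pid = os.fork()
if pid == 0:
    before = private_dirty_kb()
    sys.getrefcount(big)           # dirties only the page holding big's header
    after_big = private_dirty_kb()
    for obj in small:              # dirties roughly one page per few dozen objects
        sys.getrefcount(obj)
    after_small = private_dirty_kb()
    print("dirtied by the big object (kB): ", after_big - before)
    print("dirtied by the small objects (kB):", after_small - after_big)
    os._exit(0)
os.waitpid(pid, 0)
```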

1 Answer


I don't know of any way to solve this outside the kernel - you'd need to go through the virtual-to-physical mappings of every process, then correlate physical mappings between processes while accounting for swapped-out memory that doesn't have a physical mapping. And by the time you finished, your answer would no longer be correct.
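To make concrete what "go through the virtual-to-physical mappings" would involve, here is a hedged sketch for a single mapping of a single process, using /proc/<pid>/pagemap and /proc/kpagecount. It assumes 4 KB pages and root access (kernels since 4.0 zero the page frame number for unprivileged readers); the address range would come from /proc/<pid>/maps, and swap handling and cross-process correlation are exactly the parts left out:

```python
import struct

PAGE_SIZE = 4096  # assumption: 4 KB pages, as on typical x86-64 configurations

def mapping_share_counts(pid, vstart, vend):
    """For each present page in [vstart, vend), return (virtual address,
    number of times its physical frame is mapped), via pagemap/kpagecount."""
    results = []
    with open(f"/proc/{pid}/pagemap", "rb") as pm, \
         open("/proc/kpagecount", "rb") as kc:
        for vaddr in range(vstart, vend, PAGE_SIZE):
            pm.seek((vaddr // PAGE_SIZE) * 8)
            entry = struct.unpack("<Q", pm.read(8))[0]
            present = (entry >> 63) & 1        # bit 63: page present in RAM
            if not present:
                continue                        # swapped out or never faulted in
            pfn = entry & ((1 << 55) - 1)       # bits 0-54: page frame number
            kc.seek(pfn * 8)
            mapcount = struct.unpack("<Q", kc.read(8))[0]
            results.append((vaddr, mapcount))   # mapcount > 1 => shared (e.g. CoW)
    return results
```

Repeating this for every mapping of every process, deciding which process to "charge" for each shared frame, and doing it before the numbers go stale is what makes the approach impractical in user space.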

I know of no OS that provides what you're asking. If it were worth solving, I have to think someone would have done so.

Andrew Henle
    I updated the question to explain why it is useful. Can you clarify what you mean by "no physical mapping"? I thought the shared memory that I was asking about would have the mapping to physical memory from both parent and child processes, but the mapping will be to the *same* physical memory, no? – max Apr 12 '15 at 22:04
  • @max: some memory may be swapped out, other memory may not yet have been allocated a physical mapping, yet other memory may deliberately be mapped to a file. – Oliver Charlesworth Apr 12 '15 at 22:12