Apart from external inspection, you can also instrument your implementation of malloc to let you inspect those statistics. jemalloc
and tcmalloc
are implementations that, on top of performing better for multithreaded code that typical libc implementations, add some utility functions of that sort.
To dig deeper, you should learn a bit more how heap allocation works. Ultimately, the OS is the one assigning memory to processes as they ask for it, however requests to the OS (syscalls) are slower than regular calls, so in general an implementation of malloc
will request large chunks to the OS (4KB or 8KB blocks are common) and the subdivise them to serve them to its callers.
You need to identify whether you are interested in the total memory consumed by the process (which includes the code itself), the memory the process requested from the OS within a particular procedure call, the memory actually in use by the malloc
implementation (which adds its own book-keeping overhead, however small) or the memory you requested.
Also, fragmentation can be a pain for the latter two, and may somewhat blurs the differences between really used and assigned to.