Measuring Memory Consumption used by a concurrent code in C++

Question

I'm implementing the FineList and the LazyList classes in C++. Both the above concurrent linked lists have been implemented in Java in the book "The Art of Multiprocessor Programming". I'd like to measure the amount of memory consumed by each algorithm during it's entire course of execution. I'm not sure how do I do that. I could just keep track of the number of times "new" is called by each thread, in a counter, in both algorithms and use that as a measuring criteria. Similarly, whenever a thread calls "delete", I'd decrement the counter. Is that a fair criteria to measure memory consumption? The problem is FineList algorithm allows me to immediately "delete" a node once it's removed from the linked-list, due to it's lock-based nature. But that's not the case in LazyList algorithm, since it has lock-free methods. Is there any other way to measure memory consumption or is the above method fair to both algorithms?

What operating system/environment are you using? There are tools available to do this — Dominic Price, Jun 19 '18 at 19:40
Note that the number of `new` and `delete` invocations does not necessarily correlate to the amount of memory allocated (though it does indicate churn/allocation overhead). — Cameron, Jun 19 '18 at 19:44
getrusage() might be what you’re looking for, you can call it at various points during your function. See https://stackoverflow.com/questions/63166/how-to-determine-cpu-and-memory-consumption-from-inside-a-process for example — Dominic Price, Jun 19 '18 at 19:44
@DominicPrice, if I use getrusage() at the end of the algorithm execution, would it show the actual memory consumption for the whole execution? — Dee Jay, Jun 19 '18 at 19:54
@Cameron, since I'm implementing it in C++, and it doesn't have a garbage collector, wouldn't the invocations show the exact amount of memory usage by each algorithm? — Dee Jay, Jun 19 '18 at 19:55
@DeeJay: Yes, if you also track the size of the allocations. For some reason, I thought you meant counting just the number of `new`/`delete` calls :-) Keep in mind a lot of code uses the underlying `malloc` directly, though. — Cameron, Jun 20 '18 at 14:50

BeeOnRope · Answer 1 · 2018-06-20T15:45:39.807

If you keep an ordered log of the new and delete calls (including the size requested) and you are sure the code you are interested in only uses new and delete and not other allocation routines, you can determine more or less the theoretical memory consumption at any point in time by keeping a running tally of the excess of memory allocated over memory freed. You could perhaps generate such a log by overloading global operator new(size_t) as well as delete.

The number will be only theoretical due to several factors:

Allocators add a certain amount of overhead, so the actual memory allocated will generally be larger than the sum of the sizes of the objects allocated. This overhead includes fragmentation, since some unallocated memory could be technically-free-but-not-really-available.
Allocators may not return any memory to the OS, or they may return it, but in an unpredictable way. So if you are measuring "memory allocated" as far as the OS is concerned (versus as far as the runtime allocator is concerned), you have a harder problem and this is highly allocator dependent.
Especially in a multi-threaded scenario, not all freed memory is necessarily used by future allocations. A specific case of this for thread-aware allocators is the use of thread-local allocation buffers: memory freed on one thread might not be immediately usable on another thread, until some threshold is reached whereby allocations can move across threads. This might be relevant for your scenario if there is a disparity of nodes allocated and freed among threads.
There is a whole layer of complexity when it comes to determining how much memory has been allocated - even to define that term. For example, for a large allocation, the OS may return you a chunk of memory which doesn't actually exist in RAM, and only page it is lazily as you access it. So if you don't access everything you allocate, the numbers reported by the allocator could actually be an over-estimate.

That's just scratching the surface.

score 0 · Answer 2 · answered Jun 20 '18 at 10:26

C++ allows you to provide your own operator new and matching operator delete. This is useful for such measurement tasks. You can even use this to figure out the memory used as a function of the allocation strategy, e.g. see how much more memory your algorithm needs when rounding allocation up to a multiple of 16 bytes.

The large benefit of this approach is that you don't need to touch the code of your algorithm itself. You can use the bookkeeping of your own allocator.

Mind you, the idea of lock-free programming can be overly optimistic if you actually need new. There's no guarantee in C++ whatsoever that new is lock-free. And since C++ allows you to new memory in one thread and delete it in another, some cross-thread synchronization is needed.

Measuring Memory Consumption used by a concurrent code in C++

2 Answers2