
I want to see the peak memory usage of a command. I have a parametrized algorithm and I want to know when the program will crash with an out-of-memory error on my machine (12GB RAM).

I tried:

/usr/bin/time -f "%M" command
valgrind --tool=massif command

The first one gave me 1414168 (1.4GB; thank you ks1322 for pointing out it is measured in KB!) and valgrind gave me

$ ms_print massif.out
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 75 26,935,731,596       22,420,728       21,956,875       463,853            0

I'm a bit confused which number I should take, but let's assume "total" (22MB).

And the massif-visualizer shows me

[screenshot: massif-visualizer output]

Now I have 3 different numbers for the same command:

  • valgrind --tool=massif command + ms_print: 22MB
  • valgrind --tool=massif command + massif-visualizer: 206MB (this is what I see in htop and I guess this is what I'm interested in)
  • time -f "%M" command: 1.4GB

Which is the number I should look at? Why are the numbers different at all?

Martin Thoma

1 Answer


/usr/bin/time -f "%M" measures the maximum RSS (resident set size), that is the memory used by the process that is in RAM and not swapped out. This memory includes the heap, the stack, the data segment, etc.

The value reported is the largest RSS among the child processes (including grandchildren), each taken individually; it is not the peak of their summed RSS.
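To make the "max RSS of the children" idea concrete, here is a minimal Python sketch that reads the same kind of counter through getrusage. The helper name peak_child_rss_kb and the 50 MB stand-in workload are made up for illustration, and the kilobyte unit assumes Linux (macOS reports bytes).

```python
import resource
import subprocess
import sys


def peak_child_rss_kb(argv):
    """Run argv, wait for it, and return ru_maxrss for waited-for children.

    On Linux the value is in kilobytes. The counter covers every child
    this process has waited for so far (and their waited-for descendants),
    so call this from a fresh process to get a clean reading.
    """
    subprocess.run(argv, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Hypothetical stand-in workload: a child that fills about 50 MB.
    child = [sys.executable, "-c", "data = b'x' * (50 * 1024 * 1024)"]
    print(peak_child_rss_kb(child), "KB")
```

Because the counter is a maximum over individual processes, two children that each peak at 1 GB would still show up as roughly 1 GB here, not 2 GB.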

valgrind --tool=massif, as the documentation says:

measures only heap memory, i.e. memory allocated with malloc, calloc, realloc, memalign, new, new[], and a few other, similar functions. This means it does not directly measure memory allocated with lower-level system calls such as mmap, mremap, and brk

Massif measures only the memory of the direct child (not grandchildren). It measures neither the stack nor the text and data segments.

(Options like --pages-as-heap=yes and --stacks=yes let it measure more.)

So in your case the differences are:

  • time takes into account the grandchildren, while valgrind does not
  • time does not measure the memory swapped out, while valgrind does
  • time measures the stack and data segments, while valgrind does not

You should now:

  • check if some children are responsible for the memory consumption
  • try profiling with valgrind --tool=massif --stacks=yes to check the stack
  • try profiling with valgrind --tool=massif --pages-as-heap=yes to check the rest of the memory usage
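
When comparing these runs, it may also help to confirm that the ms_print row you read is really the peak snapshot. The sketch below is one possible way to do that: it scans a massif.out file and returns the snapshot with the largest total. The helper name massif_peak is hypothetical, and the field names (snapshot, mem_heap_B, mem_heap_extra_B, mem_stacks_B) are assumed from the Massif output file format as I understand it.

```python
import re


def massif_peak(path):
    """Return (snapshot_number, total_bytes) for the largest snapshot.

    The total is mem_heap_B + mem_heap_extra_B + mem_stacks_B, i.e. the
    sum of the useful-heap, extra-heap and stacks columns of ms_print.
    """
    field = re.compile(
        r"^(snapshot|mem_heap_B|mem_heap_extra_B|mem_stacks_B)=(\d+)$"
    )
    snapshots = []
    current = None
    with open(path) as fh:
        for line in fh:
            match = field.match(line.strip())
            if not match:
                continue  # skip headers, separators and heap_tree lines
            key, value = match.group(1), int(match.group(2))
            if key == "snapshot":
                # A new snapshot record starts here.
                current = {"snapshot": value, "total": 0}
                snapshots.append(current)
            elif current is not None:
                current["total"] += value
    if not snapshots:
        raise ValueError("no snapshots found in " + path)
    best = max(snapshots, key=lambda s: s["total"])
    return best["snapshot"], best["total"]
```

Calling massif_peak("massif.out") on the file from the question would show whether snapshot 75 (22,420,728 bytes) is the maximum or just one row among many.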
user803422
  • Thank you! (+50) - I will check this (probably next week) and might come back with questions (or directly accept the answer :-) ) – Martin Thoma Sep 26 '19 at 09:28