
Assume I have a C program (running under Linux) which manipulates many data structures, some complex, several of which can grow and shrink but should not in general grow over time. The program is observed to have a gradually increasing RSS over time (more so than can be explained by memory fragmentation). I want to find what is leaking. Running under valgrind is the obvious suggestion here, but valgrind (with --leak-check=full and --show-reachable=yes) shows no leak. I believe this to be because the data structures themselves are correctly being freed on exit, but one of them is growing during the life of the program. For instance, there might be a linked list which is growing linearly over time, with someone forgetting to remove the resource from the list, but the exit cleanup correctly freeing all the items on the list. There is a philosophical question as to whether these are in fact 'leaks' if they are freed, of course (hence the quote marks in the question).

Are there any useful tools to instrument this? What I'd love is the ability to run under valgrind and have it produce a report of current allocations just as it does on exit, but to have this happen on a signal and allow the program to continue. I could then look for what stack trace signatures had growing allocations against them.

I can reliably get a nice large 'core' file from gdb with generate-core-file; is there some way to analyse that off-line, if say I compiled with a handy malloc() debugging library that instrumented malloc()?

I have full access to the source, and can modify it, but I don't really want to instrument every data structure manually, and moreover I'm interested in a general solution to the problem (like valgrind provides) rather than how to address this particular issue.

I've looked for similar questions on here but they appear all to be:

  • Why does my program leak memory?
  • How do I detect memory leaks at exit? (no use for me)
  • How do I detect memory leaks from a core file? (great, but none has a satisfactory answer)

If I was running under Solaris I'm guessing the answer would be 'use this handy dtrace script'.

abligh
  • I am very fond of valgrind and am pretty sure valgrind will show you these "leaks". If you free the list entry and forget to free the resources within the list entry, these will show up in valgrind as "definitely lost". Just don't use `--show-reachable=yes` because this will hide what I just explained, since it is in fact unreachable. – Montaldo Jun 18 '14 at 08:28
  • He's just confused with the terminology. As long as the memory is reachable, it's not lost or leaking. Please rewrite the question, and refrain from using the word "leak". It's NOT a leak. – Karoly Horvath Jun 18 '14 at 08:31
  • @Montaldo I am very familiar with `valgrind`; It does not show me the leaks because they are freed on exit. – abligh Jun 18 '14 at 08:37
  • @KarolyHorvath please see the bit in my question where I explained the 'leaks' are freed on exit, and that there is a philosophical question as to whether these should be called 'leaks' at all. However, they are 'leaks' in the sense they are using memory for no purpose; whilst the program has not lost track of them in every data structure, they obviously aren't doing anything useful. I think I've made it sufficiently obvious that they are not normal leaks. – abligh Jun 18 '14 at 08:39
  • @abligh Explain to me how something is a "leak" when it is freed on exit. – Montaldo Jun 18 '14 at 08:41
  • @Montaldo, using a reasonable definition (the first google threw up) "a failure in a program to release discarded memory, causing impaired performance or failure." - that's what's happening. Note that with any program *all* memory is *always* freed on exit (when the pages are unmapped), and until the pages are unmapped something always (even if it's only the OS and the `malloc` allocation structures themselves) has a reference to the memory. So looking at leaked memory as only stuff that is not freed by a call to `free()` on exit is too narrow a view. – abligh Jun 18 '14 at 08:46
  • man, all I can say: you're waaaay too theoretical. – Karoly Horvath Jun 18 '14 at 08:47
  • @KarolyHorvath Does it matter ? Finding application bugs that fail to release memory is still valuable. – nos Jun 18 '14 at 08:48
  • Well, wouldn't life be a lot easier if you called "leak" what everybody else on this blue planet calls "leak"? Am I too pragmatic? :) – Karoly Horvath Jun 18 '14 at 08:49

2 Answers

Valgrind includes a gdbserver. This means you can connect to it with gdb and, for example, request a leak dump or list all reachable memory while the program is running. Of course, you have to judge for yourself whether there is a "memory leak" or not, as valgrind can't know whether a bug in the application logic fails to release memory while still keeping references to it.

Run valgrind with the --vgdb=yes flag and then run the commands:

valgrind --vgdb=yes --leak-check=full --show-reachable=yes ./yourprogram 
gdb ./yourprogram
(gdb) target remote | vgdb
(gdb) monitor leak_check full reachable any

See the docs for more info, here and here

You can also do this programmatically in your program:

#include <valgrind/memcheck.h>

and at an appropriate place in the code do:

 VALGRIND_DO_LEAK_CHECK;

(iirc that'll show reachable memory too, as long as valgrind is run with --show-reachable=yes.)
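As a rough sketch of how the programmatic check could be triggered by a signal while the program keeps running: the handler below only sets a flag, and the main loop performs the check at a safe point. The helper names `install_leak_check_signal` and `poll_leak_check` and the choice of signal are assumptions made for illustration, not part of Valgrind. It uses `VALGRIND_DO_ADDED_LEAK_CHECK`, which as far as I know reports only what has grown since the previous check, and falls back to a no-op if the Valgrind headers are not installed.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction() under strict -std= modes */

#include <assert.h>
#include <signal.h>
#include <string.h>

/* Use the real client-request macro when the Valgrind headers are present;
   otherwise compile it away so the program still builds. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
#  define VALGRIND_DO_ADDED_LEAK_CHECK ((void)0)
#endif

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t leak_check_requested = 0;

/* Async-signal-safe: only records that a check was requested. */
static void on_leak_check_signal(int signo)
{
    (void)signo;
    leak_check_requested = 1;
}

/* Hypothetical helper: arrange for `signo` (e.g. SIGUSR1) to request a
   leak check. Returns 0 on success, -1 on failure (as sigaction does). */
int install_leak_check_signal(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_leak_check_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; /* do not break slow system calls */
    return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper: call once per iteration of the main loop.
   Returns 1 if a leak check was run, 0 if none was pending. */
int poll_leak_check(void)
{
    if (!leak_check_requested)
        return 0;
    leak_check_requested = 0;
    VALGRIND_DO_ADDED_LEAK_CHECK; /* no-op when not running under valgrind */
    return 1;
}
```

Call `install_leak_check_signal(SIGUSR1)` at startup and `poll_leak_check()` from the main loop, then send the signal with `kill -USR1 <pid>` whenever you want a snapshot. For a daemon whose stderr goes to `/dev/null`, valgrind's `--log-file=` option should get the reports into a file instead.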

nos
  • Thanks - that looks fantastically useful, especially VALGRIND_DO_ADDED_LEAK_CHECK. It would be even better if I could get the output to a file, but I shall have a poke around. – abligh Jun 18 '14 at 08:43
  • You could always redirect stdout/stderr to a file, and I'm sure gdb has some commands to save output to a file too. – nos Jun 18 '14 at 08:44
  • I haven't tried it but I think the output of the *programmatic* calls will appear on the STDERR of the target, won't it? That's directed to `/dev/null` in this case (it's a daemon). Anyway, that's a trivial issue to solve. – abligh Jun 18 '14 at 08:48

There's the Valgrind Massif tool which shows general memory usage of your application, not just for leaked memory. It breaks down malloc()s and free()s by calling functions and their backtraces, so you can see which functions keep allocating memory without releasing it. This can be an excellent tool for finding leaks of the type you mentioned.

Unfortunately the tooling around Massif is a bit weird... The ms_print tool provided with Valgrind is only useful for the most basic tasks; for real work you probably want something that displays a graph. There are several tools for this strewn around the net; see e.g. the question "Valgrind Massif tool output graphical interface?".
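To get a feel for what Massif reports, it can help to try it on a deliberately broken toy first. The sketch below is a minimal stand-in for the pattern described in the question: a list that only ever grows while "requests" are handled, yet is fully freed at exit, so an exit-time leak check stays clean. All names here (`struct list`, `list_push`, `simulate_requests`, `list_free_all`) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One tracked resource; the payload just makes each node a visible size. */
struct node {
    struct node *next;
    char payload[64];
};

struct list {
    struct node *head;
    size_t count;
};

/* Add one entry at the head. Returns 0 on success, -1 if out of memory. */
int list_push(struct list *l)
{
    struct node *n = calloc(1, sizeof *n);

    assert(l != NULL);
    if (n == NULL)
        return -1;
    n->next = l->head;
    l->head = n;
    l->count++;
    return 0;
}

/* The deliberate bug: every "request" registers an entry and nothing ever
   removes it again. Returns the list length afterwards. */
size_t simulate_requests(struct list *l, size_t n_requests)
{
    size_t i;

    for (i = 0; i < n_requests; i++) {
        if (list_push(l) != 0)
            break;
    }
    return l->count;
}

/* Exit-time cleanup: frees everything, which is why a leak check at exit
   sees nothing wrong. Returns the number of nodes freed. */
size_t list_free_all(struct list *l)
{
    size_t freed = 0;

    while (l->head != NULL) {
        struct node *next = l->head->next;
        free(l->head);
        l->head = next;
        freed++;
    }
    l->count = 0;
    return freed;
}
```

Wrapped in a `main` that calls `simulate_requests` in a loop and `list_free_all` before returning, built with `-g` and run under `valgrind --tool=massif`, the resulting `massif.out.<pid>` file can be fed to `ms_print`; the allocation in `list_push` should then stand out as the call site whose heap share keeps increasing.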

oliver