I searched a lot for this but didn't find a clear answer. I am trying to find memory leaks in an MPI code using valgrind. While reducing the code to a minimal example to isolate the problem, I realized that even a simple program like the following:

program foo
  implicit none
  include "mpif.h"
  integer :: ierr

  call MPI_init(ierr)
  call MPI_finalize(ierr)
end program

reports memory leaks under valgrind. Compiling with

mpif90 foo.f90

and running

mpirun -n 1 valgrind ./a.out 

gives:

==523== Memcheck, a memory error detector
==523== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==523== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==523== Command: ./a.out
==523==
==523== Conditional jump or move depends on uninitialised value(s)
==523==    at 0x5AA4236: opal_value_unload (in /home/francesco/openmpi/lib/libopen-pal.so.20.10.1)
==523==    by 0x57C8C0A: ompi_proc_complete_init (in /home/francesco/openmpi/lib/libmpi.so.20.10.1)
==523==    by 0x57CC5FE: ompi_mpi_init (in /home/francesco/openmpi/lib/libmpi.so.20.10.1)
==523==    by 0x57EBD52: PMPI_Init (in /home/francesco/openmpi/lib/libmpi.so.20.10.1)
==523==    by 0x4E7E5A7: MPI_INIT (in /home/francesco/openmpi/lib/libmpi_mpifh.so.20.11.0)
==523==    by 0x4009FE: MAIN__ (in /mnt/c/Users/Utente/Lavoro/prove/MPI/a.out)
==523==    by 0x400A46: main (in /mnt/c/Users/Utente/Lavoro/prove/MPI/a.out)
==523==
==523==
==523== HEAP SUMMARY:
==523==     in use at exit: 83,013 bytes in 609 blocks
==523==   total heap usage: 16,641 allocs, 16,032 frees, 3,629,830 bytes allocated
==523==
==523== LEAK SUMMARY:
==523==    definitely lost: 11,711 bytes in 13 blocks
==523==    indirectly lost: 1,094 bytes in 28 blocks
==523==      possibly lost: 0 bytes in 0 blocks
==523==    still reachable: 70,208 bytes in 568 blocks
==523==         suppressed: 0 bytes in 0 blocks
==523== Rerun with --leak-check=full to see details of leaked memory
==523==
==523== For counts of detected and suppressed errors, rerun with: -v
==523== Use --track-origins=yes to see where uninitialised values come from
==523== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

My questions are:

1) Why does this simple code give a memory leak? Am I doing something wrong or missing something?

2) I read that valgrind can give false positives with mpi. If that is the case, how can I understand which errors are related to valgrind-mpi incompatibility and which are caused by mistakes in my code?

fdv
  • *"I read that valgrind can give false positives with mpi"* It would be good if you could point to what you read, so that we don't tell you to read something you have already read. If you already have some knowledge about that, you should narrow your question to be specifically about your point 2. In particular, tell us why the duplicate link above is not sufficient and what more you need to know. 1. **Narrow** your question down so that it is not a duplicate. 2. Ask **one question per post**! 3. Be specific! 4. Read other questions on this site first! There are quite a few more, not just the one I linked! – Vladimir F Героям слава Mar 19 '18 at 13:49
  • *"how can I understand which errors are related to valgrind-mpi incompatibility"* By reading the address where the leak happens or by tracking the origins. If it is in MPI_Init (like here), it is extremely likely it is not your fault. Ask for more details if you think it is needed, but be specific. – Vladimir F Героям слава Mar 19 '18 at 13:55
  • @Vladimir I read about the false positives here [link](https://stackoverflow.com/questions/34851643/using-valgrind-to-spot-error-in-mpi-code), in the comment on the first answer. You are right, this is exactly the question you pointed me to. I didn't find it before. – fdv Mar 19 '18 at 14:14
  • @Vladimir what do you suggest I do if all the errors in the full code start from MPI_Init or MPI_finalize and the memory still increases while the code is running? I think this issue is related to a memory leak inside my code. – fdv Mar 19 '18 at 14:16
  • Prepare a [mcve] and ask about that particular code, with all your observations about the memory consumption and the output of valgrind. Ask a specific question instead of a broad, general one; that is more likely to solve your actual problem. Remember that the code must contain the problem, so it must not be oversimplified. – Vladimir F Героям слава Mar 19 '18 at 14:54
  • Also, very likely this is not going to change anything here, but it is better to use `use mpi` instead of `include "mpif.h"`. The compiler can then check the correctness of your calls better. – Vladimir F Героям слава Mar 19 '18 at 14:56
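
The suggestion in the comments, to look at where each error or leak is reported, can be roughly automated. The following is only a sketch: the function name `filter_non_mpi_records` is hypothetical, the library names `libmpi` and `libopen-pal` are taken from the valgrind output in the question, and `libopen-rte` is an assumption about what else an Open MPI installation may load. It prints only the valgrind records whose stack trace never passes through one of those libraries, which are the records most likely to originate in user code.

```shell
# Print valgrind records whose stack does not touch the MPI libraries.
# Usage: filter_non_mpi_records valgrind.log
filter_non_mpi_records() {
  awk '
    # Work on a copy of the line with the ==PID== prefix stripped.
    { line = $0; sub(/^==[0-9]+== ?/, "", line) }

    # An empty line (after stripping) closes the current record.
    line == "" { emit(); next }

    {
      rec = rec $0 "\n"
      # Stack frames look like "   at 0x...: func (in lib)".
      if (line ~ /^ +(at|by) 0x/) has_stack = 1
      # Assumed library names; adjust for a different MPI implementation.
      if (line ~ /libmpi|libopen-pal|libopen-rte/) in_mpi = 1
    }

    END { emit() }

    # Print the record only if it has a stack and no MPI library frame.
    function emit() {
      if (has_stack && !in_mpi) printf "%s\n", rec
      rec = ""; has_stack = 0; in_mpi = 0
    }
  ' "$1"
}
```

A log to feed it could be produced with something like `mpirun -n 1 valgrind --leak-check=full --log-file=valgrind.%p.log ./a.out` (valgrind expands `%p` to the process ID), followed by `filter_non_mpi_records valgrind.<pid>.log`. This is a heuristic, not a proof: a record that does pass through the MPI library can still be caused by user code, for example when a bad buffer is handed to an MPI call, so the filtered-out records may still deserve a look.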
