7

I don't know if this question is going to be clear, since I can't give too many details (I'm using a third-party library (TPL) and wrote a large amount of code myself). But I'll give it a try.

I am experiencing a segmentation fault which I can't understand. There is a structure (which I didn't design but should be well tested) whose destructor looks like this

Data::~Data()
{
  if(A_ != 0) {
    delete A_;
    A_ = 0;
  }

  if(B_ != 0) {
    delete B_;
    B_ = 0;
  }

  if(C_ != 0) {
    delete C_;
    C_ = 0;
  }
} // HERE

What's bothering me is that, while debugging, the segfault is reported at the line marked with 'HERE'. The class Data has only A_, B_ and C_ as dynamically allocated attributes. I also tried to explicitly call the destructor on the other non-dynamic composite attributes, to see if something went wrong during their destruction, but again the segfault happens at the end of the destructor. What kind of error can cause a segfault at that point?

I hope the question is clear enough, I will add details if needed.

Edit: thanks for the replies. I know it's a scarce piece of code, but the whole library is of course too big (it comes from Trilinos, by the way, but I think the error is not their fault; it must be my mistake in handling their structures. I used short names to keep the problem more compact). Some remarks on what was asked in the comments:

  • about the checks before the delete(s) and the raw pointers: as I said, it's not my choice. I guess it's a double protection in case something goes wrong and A_, B_ or C_ has already been deleted by some other owner of the data structure. The choice of raw pointers over shared_ptr or other smart pointers is probably due to the fact that this class is almost never used directly, but only by an object of class Map that has a pointer to Data. This class Map is implemented in the same library, so they probably chose raw pointers since they knew what they were handling and how.
  • yes, the data structure is shared by all the copies of the same object. In particular, there is a Map class that contains a pointer to a Data object. All the Maps that are copies of one another share the same Data. A reference counter keeps track of how many Maps are holding a pointer to the data. The last Map to be destroyed deletes the data.
  • the reference counter of the Data structure works correctly, I checked it.
  • I am not calling the destructor of this class. It is called automatically by the destructor of an object of class Map that has a pointer to Data as attribute.
  • Data inherits from BaseData, whose (virtual) destructor doesn't do anything, since it's just an interface defining class.
  • It's hard to post code that reproduces the problem, for many reasons. The error appears only with more than 2 processes (it's an MPI program), and my guess is that a process has some list that is empty and tries to access some element.
  • about the error details. I can give you here the last items in the backtrace of the error during debugging (I apologize for the bad format, but I don't know how to put it nicely):

    1. 0x00007ffff432fba5 in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64

    2. 0x00007ffff43336b0 in abort () at abort.c:92

    3. 0x00007ffff436965b in __libc_message (do_abort=, fmt=) at ../sysdeps/unix/sysv/linux/libc_fatal.c:189

    4. 0x00007ffff43736d6 in malloc_printerr (action=3, str=0x7ffff4447780 "free(): corrupted unsorted chunks", ptr=) at malloc.c:6283

    5. 0x00007ffff4379ea3 in __libc_free (mem=) at malloc.c:3738

    6. 0x0000000000c21f71 in Epetra_BlockMapData::~Epetra_BlockMapData ( this=0x1461690, __in_chrg=) at /home/bartgol/LifeV/trilinos/trilinos-10.6.4-src/packages/epetra/src/Epetra_BlockMapData.cpp:110

To conclude, let me restate my question: what kind of errors can appear AT THE END of the destructor, even if ALL the attributes have been deleted already? Thanks again!

Patrizio Bertoni
bartgol
  • How are you calling the destructor? – Sam I am says Reinstate Monica Jul 30 '12 at 17:47
  • And how were A_, B_, and C_ allocated? – Marlon Jul 30 '12 at 17:48
  • 2
    Why don't you use a memcheck tool like valgrind? – Zaffy Jul 30 '12 at 17:49
  • 13
    `delete`ing a null pointer is perfectly valid (it's a NOP) and so there's no need for all those `if` statements. And why do you bother setting the deleted pointers to 0 in the destructor? And why, oh why is your class managing **3 raw pointers**? You'd be better off sticking each of those in a smart pointer (such as `unique_ptr`) – Praetorian Jul 30 '12 at 17:51
  • 2
    please post code which reproduces the problem. – Karoly Horvath Jul 30 '12 at 17:52
  • 8
    Do you copy `Data` objects around? In that case, you need to follow [The Rule of Three](http://stackoverflow.com/questions/4172722/). – fredoverflow Jul 30 '12 at 17:53
  • What does the stack trace say? Could it be that the segfault appears from any of the class member or base class destructors? – πάντα ῥεῖ Jul 30 '12 at 17:55
  • It would be exceptionally helpful if you could give us a complete, compilable example of the error occurring. It isn't possible to tell at this point what is actually going wrong. – Rook Jul 30 '12 at 18:00
  • What does the disassembly look like around the crashing instruction? What is the full call stack? Is the return address on the stack getting overwritten with garbage? – Adam Rosenfield Jul 30 '12 at 18:00
  • Is this class derived from any other? – kolenda Jul 30 '12 at 18:12
  • "at that point" compiler places all calls of class member destructors followed by destructors of base classes. So check them for possible errors. – xaizek Jul 30 '12 at 18:27
  • Thanks everybody! I added some information that hopefully will be helpful. – bartgol Jul 31 '12 at 14:12

3 Answers

6

One problem that can cause a segfault at a function exit is heap or stack corruption.

It is possible that some other part of your program is causing the problem. Something like a double destruction or a buffer overrun can corrupt memory, and the damage is often only detected later, at an unrelated point.

Often, debug builds of programs will include a check at function exit to ensure that the stack is intact. If it's not, well, you see the results.
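
One common source of double destruction is a shared object whose owners disagree about who deletes it. As a minimal sketch only, assuming hypothetical `Map` and `Data` stand-ins (these are not the Trilinos classes, and the `live` counter exists purely for illustration), a reference-counted handle has to keep the count consistent in the copy constructor, the assignment operator and the destructor, so that the shared object is deleted exactly once:

```cpp
// Hypothetical stand-ins for a handle class and its shared payload.
// This is an illustration, not the library's actual implementation.
struct Data {
    static int live;      // number of Data objects currently alive (for checking)
    int ref_count = 1;    // number of Map handles sharing this Data
    Data() { ++live; }
    ~Data() { --live; }
};
int Data::live = 0;

class Map {
public:
    Map() : data_(new Data) {}

    // Copying shares the payload and bumps the counter.
    Map(const Map& other) : data_(other.data_) { ++data_->ref_count; }

    // Assignment releases the old payload before sharing the new one.
    Map& operator=(const Map& other) {
        if (data_ != other.data_) {
            release();
            data_ = other.data_;
            ++data_->ref_count;
        }
        return *this;
    }

    ~Map() { release(); }

    int use_count() const { return data_->ref_count; }

private:
    // Only the last handle deletes the shared payload.
    void release() {
        if (--data_->ref_count == 0) {
            delete data_;
        }
    }

    Data* data_;
};
```

If either the copy constructor or the assignment operator were left compiler-generated here, two handles would end up decrementing a count that was never incremented, and the payload would be deleted twice.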

Anthony
2

When the explicit body of the class destructor completes, it proceeds to perform some implicit actions: it calls base class and member destructors (in case you have base classes and members with non-trivial destructors) and, if necessary, it calls raw memory deallocation function operator delete (yes, in a typical implementation operator delete is actually called from inside the destructor). One of these two implicit processes caused the crash in your case, apparently. There's no way to say precisely without more information.
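
To make that sequence visible, here is a small sketch with hypothetical types (`Member`, `Base`, `Data` and the global log are made up for the illustration). It records each step that runs when a dynamically allocated object is deleted through a base pointer:

```cpp
#include <string>
#include <vector>

// Global log of the steps, in the order they actually run.
static std::vector<std::string> g_log;

struct Member {
    ~Member() { g_log.push_back("member destructor"); }
};

struct Base {
    virtual ~Base() { g_log.push_back("base destructor"); }
};

struct Data : Base {
    Member m_;

    // Explicit destructor body: this is all the source code shows.
    ~Data() override { g_log.push_back("Data destructor body"); }

    // Class-specific deallocation function, so the final step is observable.
    static void operator delete(void* p) {
        g_log.push_back("operator delete");
        ::operator delete(p);
    }
};

// Creates and deletes one Data object, returning the recorded steps.
std::vector<std::string> destruction_order() {
    g_log.clear();
    Base* p = new Data;
    delete p;
    return g_log;
}
```

Calling `destruction_order()` yields the destructor body first, then the member destructor, then the base destructor, and finally `operator delete`. A debugger attributes the last three steps to the closing brace of the destructor, which is why a failure in any of them appears to happen "at the end".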

P.S. Stylistically the code is awful. Why are they checking for null before doing delete? What is the point of nulling deleted pointers in the destructor?
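
For comparison, a sketch of what the same cleanup could look like without the checks, assuming C++11 is available and using a hypothetical `Tracked` type in place of whatever A_, B_ and C_ really point to. Deleting a null pointer is a no-op, and `std::unique_ptr` members release their objects without any hand-written destructor:

```cpp
#include <memory>

// Hypothetical payload type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

class Data {
public:
    // b_ is deliberately left null to show that no check is needed.
    Data() : a_(new Tracked), b_(), c_(new Tracked) {}
    // No user-written destructor: each unique_ptr deletes its object, if any.

private:
    std::unique_ptr<Tracked> a_;
    std::unique_ptr<Tracked> b_;
    std::unique_ptr<Tracked> c_;
};

// Number of Tracked objects alive while one Data exists.
int live_inside_scope() {
    Data d;
    return Tracked::live;
}

// Number of Tracked objects alive after that Data has been destroyed.
int live_after_scope() {
    { Data d; }
    return Tracked::live;
}

// Deleting a null raw pointer is well defined and does nothing.
bool delete_null_is_noop() {
    int* p = nullptr;
    delete p;
    return true;
}
```

This does not explain the crash in the question; it only illustrates why the `if` checks and the assignments to 0 add nothing.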

AnT stands with Russia
  • Thanks for your reply. I didn't write that code, it's in a library I am using. ;-) The base class has a virtual destructor that does nothing (the base class just defines an interface). The attributes can be all deleted manually but the segfault would still happen at the moment when the destructor "returns". – bartgol Jul 31 '12 at 14:16
0

It's hard to tell from the scarce code you show. It could easily be that you have already released resources that one of your class members or your base class uses in its own destructor.

πάντα ῥεῖ
  • Thanks for your reply. That's what I thought. But I also tried to destroy *manually* all the attributes before the end of the destructor and the error always raises when the destructor "returns". – bartgol Jul 31 '12 at 14:14
  • @bartgol What do you mean by 'destroy manually'? Calling the destructors explicitly? That won't help, because in that case the destructors of non-dynamically allocated attributes will be called twice. Destructors should never be called explicitly. – πάντα ῥεῖ Jul 31 '12 at 15:12
  • Uhm, that's true. But all I wanted to check was whether the problem was in the destruction of those attributes or somewhere else. Since I didn't see a segfault at the line where I explicitly called attr.~ItsType(), I conclude that their destruction is not the problem here. Correct? – bartgol Jul 31 '12 at 15:21
  • @bartgol I guessed you did it for this reason. If no segfault appears there, I'd also guess these aren't the cause, but it could still be that you made the destructor calls in a different order than the automatic destruction would. Don't you have access to the sources of some of your attributes, or what's the reason you can't debug into them? As already recommended, did you have a look at the stack trace produced by the segfault? – πάντα ῥεῖ Jul 31 '12 at 15:32
  • I managed to solve the problem. There was a whole bunch of errors on top of the one I was discussing here, so I can't say which particular reason (or reasons) caused the segfault. I thank you anyway for your help and kindness; the topic can be closed here. Best – bartgol Jul 31 '12 at 19:05