28

I am investigating a crash due to heap corruption. As this issue is non-trivial and involves analyzing the stack and dump results, I have decided to do a code review of files related to the crash.

To be frank, I don't have in-depth knowledge of when the heap could be corrupted.

I would appreciate if you could suggest scenarios which could lead to heap corruption.

Platform: Windows XP

Language: C++

Compiler: VC6

arjuncc
  • 3,227
  • 5
  • 42
  • 77
Satbir
  • 6,358
  • 6
  • 37
  • 52
  • 1
    hmm, might be a bit general...suggest you provide more (well some) detailed info... – Mitch Wheat Oct 01 '09 at 14:23
  • There are more than 10 file involve, I wanted to know programming scenarios which lead to heap corruption , I will appreciate if some one provide sample of code, ~2-3 lines :) – Satbir Oct 01 '09 at 14:27
  • My intention is to filter out coding which may lead to heap corruption – Satbir Oct 01 '09 at 14:28
  • 1
    Your question lacks a lot of details. Can you specify the platform and the context please? – Coincoin Oct 01 '09 at 14:30
  • 1
    I've used Compuware's BoundsChecker to successfully track down some weird memory errors. It's not free, but I've found it incredibly useful, especially for tracking down some really obscure errors. – Herms Oct 01 '09 at 15:44
  • 2
    It's impossible to provide code samples for this. You might as well ask "what does code which doesn't compile look like". There's a literally infinite number of possibilities. The key to programming C++ (or any other language) is to stick within the things *allowed* by the language. You can't go the opposite way, and try to compile a list of "as long as I don't do these things, my code is valid C++" – jalf Oct 01 '09 at 16:59
  • 3
    Mostly if you have sections of code that seem to be suspicious for corrupting the heap, you could insert _heapchk (see http://msdn.microsoft.com/en-us/library/aa298379%28VS.60%29.aspx ) to check if the heap is still OK or already corrupted. – mmmmmmmm Oct 01 '09 at 17:46

14 Answers14

54

Common scenarios include:

  • Writing outside the allocated space of an array (char *stuff = new char[10]; stuff[10] = 3;)
  • Casting to the wrong type
  • Uninitialized pointers
  • Typo error for -> and .
  • Typo error when using * and & (or multiple of either)

[EDIT] From the comments, a few more:

  • Mixing new [] and new with delete [] and delete
  • Missing or incorrect copy-constructors
  • Pointer pointing to garbage
  • Calling delete multiple times on the same data
  • Polymorphic baseclasses without virtual destructors
cwap
  • 11,087
  • 8
  • 47
  • 61
  • Using pointers pointing to garbage. I had a rather insidious one. const WCHAR * cstr = ICUString.c++str().c_str(); The syntax is wrong, but basically, I went from ICUString to std::wstring to const WCHAR *, but since the std::wstring is a anonymous object, it's deleted after the line's done, thus my pointer's pointing to garbage. Accessing it led to corruption and crash. – Calyth Oct 01 '09 at 15:05
  • Don't forget calling delete on two different pointers to the same object on the heap – Graphics Noob Oct 01 '09 at 16:10
  • Add to that: Passing random/already deleted pointers to `delete`, mixing `new`/`new[]` with `delete`/`delete[]`, missing or incorrect copy ctors/assignment operators, polymorphically used base classes with non-virtual dtors... This list is _very_ long. Basically it boils down to what I have summarized here: http://stackoverflow.com/questions/1346583/1346631#1346631 – sbi Oct 01 '09 at 16:11
  • All great additions. Ill add them to the list :) – cwap Oct 01 '09 at 16:15
  • accessing already freed pointers to data, calling a member function on a deleted object, using invalid iterators to stl containers, ... VC has a debug heap implementation that can find some of the issues in your list. – Hans Oct 01 '09 at 17:09
  • My understanding was using the delete keyword on a pointer that was already deleted did nothing. Is there something special about the second pointer to the same object? I have Stroustup's book (2nd ed.) if you have a page reference. (Don't forget to add it to the answer too) – Kelly S. French Oct 01 '09 at 18:16
  • Well, just tried deleting the same data 2 times in VC++, and it crashed with an assertion exception in debug mode, which makes me think that it's not a good idead to do :) I guess you potentially delete whatever data might be reallocated into that address-space later on. – cwap Oct 01 '09 at 18:51
  • 1
    @Kelly: It's passing `NULL` pointers to `delete` which is doing nothing. Passing the same pointer to `delete` twice is a fatal error leading to what's called undefined behavior. Usually this invokes Nasty Nasal Demons on you. – sbi Oct 04 '09 at 22:32
19

Welcome to hell. There is no easy solution so I will only provide some pointers.

Try to reproduce the bug in a debug environement. Debuggers can pad your heap allocations with bound checks and will tell you if you wrote in those bound checks. Also, it will consistently allocate memory using the same virtual addresses, making reproductibility easier.

In that case, you can try an analyser tool such as Purify. They will detect pretty much anything nasty that your code is doing but will also run VERY slowly. Such a tool will detect out of bound memory access, freed memory access, trying to free twice the same block, using the wrong allocator/deallocators, etc... Those are all kind of conditions that can stay latent for very long and only crash at the most inopportune moment.

Coincoin
  • 27,880
  • 7
  • 55
  • 76
  • sadly, This crash is not easy to produce,only happens once in almost million time – Satbir Oct 01 '09 at 14:33
  • 2
    I understand. However, even if the bug manifests itself by a crash one in a millionth time, it might still be detectable pretty easily by an analyser tool. – Coincoin Oct 01 '09 at 14:48
6

You can look at the sample chapter from the Advanced Windows Debugging book which provides various examples of heap corruption.

EDIT: If you are using stl containers which might move elements during changes (i.e. vector, deque), ensure that you are not keeping references into such elements across operations which might changes it.

Komat
  • 315
  • 1
  • 4
6

There are products that will observe the memory allocations and deallocations, and produce a report on anomalies. Last I used one, they weren't cheap, but I don't know what the situation is right now. However, finding stuff for VC++ 6 might be a problem.

Remember that you're liable to be getting heap corruption a lot more often than you're going to crash, so be attentive to the problem reports, and fix all heap corruption.

I'd suggest using std::tr1::smart_ptr<> instead of raw pointers everywhere, but I'm not sure VC++ 6 is going to support that.

Why are you still on VC++ 6? Would it be practical to upgrade? The tools are better with the more recent versions, and they fully support smart pointers.

David Thornley
  • 56,304
  • 9
  • 91
  • 158
  • Thanks David, Your comments are valueble ,Migrating from VC++ 6 is a big decision , Its a legacy code so there a few change possible , Yes in future i would like to use smart pointer :) thanks – Satbir Oct 01 '09 at 14:55
4

Its a common mistake to free() or delete allocated memory more than one. It may help to insert something like *var = NULL after such calls, and to check for != NULL when calling free. Although in C++ its legal to call delete with a NULL variable, calling C - free() will fail.

Also a common problem is to confuse delete and delete [].

Variables allocated with new must be released with delete.

Array allocated with new [] must be released with delete[].

Also make sure not to mix C- style memory management (malloc, calloc, free) with C++ style memory management (new/delete). In legacy code often both are mixed, but things allocated with the one can not be freed with the other.

All of these errors will usually not be recognized by a compiler.

RED SOFT ADAIR
  • 12,032
  • 10
  • 54
  • 92
  • thanks StefanWoe one basic Question , if is do new[] but use delete, rather than delete, how it will lead to heap corruption , Can rest of memory re-allocated ? – Satbir Oct 01 '09 at 14:49
  • If something is corrupted, it makes no sense to re-allocate. You are lost. – RED SOFT ADAIR Oct 01 '09 at 14:57
  • its my bad english : ) , I mean to say if i initialize char *ptr=new char[100] , and do a delete ptr, will it free first char only..if yes what will happen to rest of 99 – Satbir Oct 01 '09 at 15:06
  • The remaining 99 bytes will be lost meaning there will be a memory leak of 99 byes. – VNarasimhaM Oct 01 '09 at 15:09
  • 1
    What will happen in that case is explicitly undefined by the specification. Most likely result is that the destructor is called for first element in the array and entire block of the memory is released. However crash, leak or heap corruption might be also possible. – Komat Oct 01 '09 at 15:45
  • 2
    The behavior of calling "new[]" and then "delete" is not defined. It might free only the first element, crash the program, cook coffee whatever. – RED SOFT ADAIR Oct 01 '09 at 18:52
4

Every time you do something which isn't defined in the language standard, it is undefined behavior, and one of the ways in which it might manifest itself is through heap corruption. There are about three million ways to do this in C++, so it's really impossible to say.

A few common cases are double-freeing dynamically allocated memory, or writing outside the bounds of an array. Or writing to an uninitialized pointer.

Recent versions of Microsoft's compiler add an /analyze compiler switch which performs a bunch of static analysis to catch such errors. On Linux, valgrind is an obvious choice.

Of course, you are using VC6 which has been unsupported for years, and which has a number of known bugs, resulting in invalid code being generated.

If possible, you should upgrade to a proper compiler.

jalf
  • 243,077
  • 51
  • 345
  • 550
3

Check out the answers to this related question.

The answer I suggested provides a technique which may be able to get you back to the code that is actually causing the heap corruption. My answer describes the technique using gdb but I'm sure you must be able to do something similar on windows.

The principle at least should be the same.

Community
  • 1
  • 1
Richard Corden
  • 21,389
  • 8
  • 58
  • 85
3

An additional debugging tip is to look at the values that are being written to the wrong place using the raw memory view. Is it writing zeros... strings... some other recognizable number pattern? If you can see the corrupted memory at some point after the error occurs, that can provide a hint of the code that caused it.

Also always set pointers to null after deleting them even in destructors. Sometimes unexpected things can be triggered during a destructor chain in a base class than cause an access to a partially deleted sub-class.

Alan
  • 4,897
  • 2
  • 24
  • 17
2

During freeing heap memory, first child memory block need to be deleted and then parant memory block otherwise the child momory block would be leaked parmanently which causes the crash after running the tool millions of times.. ex:

constQ= new double* [num_equations];
for(int i=0;i<num_equations;i++)
{
constQ[i]=new double[num_equations];
for(int j=0;j<num_equations;j++)
{
constQ[i][j]=0.0;
}
.
.
.

//Deleting/Freeing memory block 
//Here the below only parent memory block is deleted, the child memory block is leaked.

if(constQ!=NULL)
{
delete[] constQ;
constQ=NULL
} 
//Correct way of deleting heap memory..First delet child block memory and then Parent block

if(constQ!=NULL)
{
for(int i=0; i <num_equations;i++)
{
delete[] constQ[i];
delete[] constQ;
constQ=NULL
}
Red
  • 557
  • 6
  • 7
2

The most difficult memory corruption bug I've run into involved (1) calling a function in a DLL that returned a std::vector and then (2) letting that std::vector fall out of scope (which is basically the whole point of std::vector). Unfortunately it turned out that the DLL was linked to one version of the C++ runtime, and the program was linked to another; which meant that the library was calling one version of new[] and I was calling a completely different version of delete[].

That is not what's happening here, because that failed every time and according to one of your comments "the bug manifests itself by a crash one in a millionth time." I would guess that there's an if statement that gets taken once in a million times and it causes a double delete bug.

I recently used evaluation versions of two products that may help you: IBM's Rational Purify and Intel Parallel Inspector. I'm sure there are others (Insure++ is mentioned a lot). On Linux you would use Valgrind.

Max Lybbert
  • 19,717
  • 4
  • 46
  • 69
  • 1
    Even your `std::vector` bug might, under some circumstances, only lead to a crash once in a million times the vector passes its memory to the wrong heap for deletion. It's the nature of undefined behavior that its outcome is, well, _undefined_. – sbi Oct 04 '09 at 22:36
1

If you have access to a *nix machine, you can use Valgrind.

Paul Nathan
  • 39,638
  • 28
  • 112
  • 212
0

have you thought isolating the source of the corruption using gflags? Once you have a dump (or breaking debugger -> WinDBG) you could see where the corruption is caused more precisely.

Here is some gflag examples: http://blogs.msdn.com/b/webdav_101/archive/2010/06/22/detecting-heap-corruption-using-gflags-and-dumps.aspx

Cheers, Seb

sedau79
  • 21
  • 5
0

These are the HeapAlloc fuction syntax.

LPVOID WINAPI HeapAlloc(
  _In_ HANDLE hHeap,
  _In_ DWORD  dwFlags,
  _In_ SIZE_T dwBytes
);

Here dwFlags paramater can have either HEAP_GENERATE_EXCEPTIONS or HEAP_NO_SERIALIZE or HEAP_ZERO_MEMORY.

In our file we have to check the flags which we have set. If we have set the flag value as HEAP_NO_SERIALIZE then there will be no serialization which means multiple thread will access the resources which may cause memory corruption.

"Setting the HEAP_NO_SERIALIZE value eliminates mutual exclusion on the heap. Without serialization, two or more threads that use the same heap handle might attempt to allocate or free memory simultaneously, likely causing corruption in the heap."

so I think due to the memory corruption in the heap, the node got crashed.

Constantin
  • 8,721
  • 13
  • 75
  • 126
-1

Try this (tested in Visual Studio MBCS build):

length = 10
element = length/sizeof(wchar_t) + 1
wchar_t* string = new wchar_t[length];
string[element] = L'\0';

Assigning element to any other value < length usually ends up with messed up memory, but >= length is instant heap corruption. In the latter case, the "Abort, Retry, Ignore" notification points to the line with the wchar_t assignment rather than the faulting L'\0' assignment.

In this case, prefer to discard the null assignment in favour of value initialization:

wchar_t* string = new wchar_t[length]();
Laurie Stearn
  • 959
  • 1
  • 13
  • 34