9

I have a fairly large program (>10k lines of C++ code). It works perfectly in debug mode, and in release mode when launched from within Visual Studio, but the release-mode binary usually (not always!) crashes when launched manually from the command line.

The line with delete causes the crash:

bool Save(const short* data, unsigned int width, unsigned int height, 
          const wstring* implicit_path, const wstring* name = NULL, 
          bool enable_overlay = false)
{
    char* buf = new char[17];
    delete [] buf;
}

EDIT: Upon request expanded the example.

The "len" is 16 in my test case (hence the 17-byte buffer). It doesn't matter whether I do anything with buf or not; it crashes on the delete.

EDIT: The application works fine without the delete [] line, but I suppose it then leaks memory (since the block is never deallocated). The buf is never used after the delete line. It also seems that it does not crash with any type other than char. Now I am really confused.

The crash message is very unspecific (the typical Windows "xyz.exe has stopped working"). When I click the "Debug the program" option, it enters VS, where the error is reported as "Access violation writing location xxxxxxxx". It is unable to locate the error, though: "No symbols were loaded for any stack frame".

I guess it is some pretty serious case of heap corruption, but how do I debug this? What should I look for?

Thanks for help.

trincot
Matěj Zábský
  • Check if you are using the correct runtime libraries, use release builds of dependent libraries etc. Difficult to say what the exact reason is. Check if the pointer `buf` is not deallocated in some other context (leading to a double free) or if you invoke UB somewhere before reaching the `delete []` call (index out of bounds). – dirkgently Feb 27 '10 at 14:02
  • This crashes even without me touching the buf pointer. I just allocate the space and then immediately delete it, and it crashes. The buf is not touched after it is deleted. – Matěj Zábský Feb 27 '10 at 14:09
  • Does the code crash if you comment out those two lines? – dirkgently Feb 27 '10 at 14:11
  • No. It also does not crash if I comment out the second line. – Matěj Zábský Feb 27 '10 at 14:15
  • Do you use threads in your program? – beermann Feb 27 '10 at 14:18
  • What compiler / linker are you using? GCC, Visual Studio (2005/2008?) - Depending on the compiler, you will have a few compile-time options that may assist you in finding the code that causes your heap corruption. – NTDLS Feb 27 '10 at 14:21
  • Visual Studio 2008 Team ed. No threads. – Matěj Zábský Feb 27 '10 at 14:43
  • Does the code compile with zero warnings? – Martin York Feb 27 '10 at 15:38
  • Only one "oh, mbstowcs is unsafe use mbstowcs_s instead" and these: 1>LINK : warning LNK4224: /OPT:NOWIN98 is no longer supported; ignored 1>ggen.obj : warning LNK4075: ignoring '/EDITANDCONTINUE' due to '/OPT:ICF' specification 1>LINK : /LTCG specified but no code generation required; remove /LTCG from the link command line to improve linker performance – Matěj Zábský Feb 27 '10 at 15:42

9 Answers

11

Have you checked for memory problems elsewhere?

Weird delete behavior is usually caused by the heap getting corrupted at one point; it only becomes apparent much later, when some other heap operation trips over the damage.

The difference between debug and release can be caused by the way Windows allocates the heap in each context. For example, in debug the heap can be very sparse, so the corruption doesn't affect anything right away.

Eric
5

The biggest difference between launching in the debugger and launching on its own is that when an application is launched from the debugger, Windows provides a "debug heap" that is filled with the 0xBAADF00D pattern; note that this is not the debug heap provided by the CRT, which instead is filled with the 0xCD pattern (IIRC).

Here is one of the few mentions that Microsoft makes about this feature, and here you can find some links about it.

Also mentioned in that link is "starting a program and attaching to it with a debugger does NOT cause it to use the "special debug heap" to be used."
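When a crash address or a garbage pointer in the debugger looks suspiciously regular, it can help to compare it against the usual fill patterns. The following is only a small sketch: `describe_fill_pattern` is a hypothetical helper name, and apart from 0xBAADF00D and 0xCD (mentioned above) the values are the commonly documented ones for the Windows and MSVC CRT debug heaps, added here as background rather than taken from the answer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps a 32-bit value seen in the debugger to the
// debug-heap fill pattern it matches, or returns nullptr if it matches none.
inline const char* describe_fill_pattern(std::uint32_t value)
{
    switch (value) {
    case 0xBAADF00Du: return "Windows debug heap: allocated, uninitialized";
    case 0xFEEEFEEEu: return "Windows debug heap: freed";
    case 0xCDCDCDCDu: return "CRT debug heap: allocated, uninitialized";
    case 0xDDDDDDDDu: return "CRT debug heap: freed";
    case 0xFDFDFDFDu: return "CRT debug heap: guard bytes around a block";
    default:          return nullptr;  // not a known fill pattern
    }
}
```

If the access violation address matches one of these, the program is most likely dereferencing a pointer read from uninitialized or freed heap memory. As far as I know, setting the environment variable `_NO_DEBUG_HEAP=1` before starting the debugger makes a debugger-launched process use the normal heap, which may help reproduce the command-line behavior.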

rogerdpack
Matteo Italia
  • Now it crashes inside the debugger as well, but it is still unable to "Load symbols for any stack frames", so I am unable to debug it effectively. Thanks, at least some progress. – Matěj Zábský Feb 27 '10 at 14:39
  • Strange, usually it loads the symbols correctly. Try this: launch it without debugging from Visual Studio, then use the "Attach to process" command to connect the VS debugger to your application's process. In this way VS should load correctly the symbols of your application. If the crash happens inside an API call, trace it back to your code using the call stack window; in this case you may get some additional info of what's going on inside the OS installing the Windows debugging symbols. – Matteo Italia Feb 27 '10 at 15:48
  • I guess the problem is that it is the Release build using /MT; it won't crash with /MTd – Matěj Zábský Feb 27 '10 at 16:42
  • The multi-thread *debug* CRT (/MTd) masks the problem, because, like Windows does with processes spawned by a debugger, it provides to your program a debug heap, that is initialized to the 0xCD pattern. Probably somewhere you use some uninitialized area of memory from the heap as a pointer and you dereference it; with the two debug heaps you get away with it for some reason (maybe because at address 0xbaadf00d and 0xcdcdcdcd there's valid allocated memory), but with the "normal" heap (which is often initialized to 0) you get an access violation, because you dereference a NULL pointer. – Matteo Italia Feb 27 '10 at 17:18
2

You probably have a memory overwrite somewhere and the delete[] is simply the first time it causes a problem. But the overwrite itself can be located in a totally different part of your program. The difficulty is finding the overwrite.

Add the following function

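#include <malloc.h>
#include <stdio.h>

#define CHKHEAP()  (check_heap(__FILE__, __LINE__))

// Reports the first call site at which the CRT heap fails its consistency
// check, together with the last call site at which it still passed.
void check_heap(const char *file, int line)
{
    static const char *lastOkFile = "here";
    static int lastOkLine = 0;
    static int heapOK = 1;

    if (!heapOK) return;    // corruption was already reported once

    int status = _heapchk();
    if (status == _HEAPOK || status == _HEAPEMPTY)
    {
        lastOkFile = file;
        lastOkLine = line;
        return;
    }

    heapOK = 0;
    printf("Heap corruption detected at %s (%d)\n", file, line);
    printf("Last OK at %s (%d)\n", lastOkFile, lastOkLine);
}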

Now call CHKHEAP() frequently throughout your program and run again. It should show you the source file and line where the heap becomes corrupted and where it was OK for the last time.
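If one particular buffer is under suspicion, or `_heapchk` is not available on your toolchain, a different and much narrower technique is to surround that buffer with sentinel bytes and check them yourself. This is only a sketch under my own assumptions: `GuardedBuffer`, the 8-byte guard size and the 0xFD fill value are all made up for illustration, and it detects overwrites of this one buffer only, not corruption of the heap in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: a char buffer with sentinel ("guard") bytes placed
// directly before and after the usable payload.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t size)
        : size_(size),
          storage_(size + 2 * kGuard, static_cast<char>(kFill))
    {
        std::memset(data(), 0, size_);  // payload starts out zeroed
    }

    char* data() { return storage_.data() + kGuard; }
    std::size_t size() const { return size_; }

    // True while both guard regions still hold the fill byte, i.e. nothing
    // has written just before or just after the payload.
    bool intact() const
    {
        for (std::size_t i = 0; i < kGuard; ++i) {
            if (storage_[i] != static_cast<char>(kFill)) return false;
            if (storage_[kGuard + size_ + i] != static_cast<char>(kFill)) return false;
        }
        return true;
    }

private:
    static constexpr std::size_t kGuard = 8;      // assumed guard width
    static constexpr unsigned char kFill = 0xFD;  // assumed fill value
    std::size_t size_;
    std::vector<char> storage_;
};
```

Pass `data()` to the suspect code and check `intact()` afterwards; a `false` result means that code wrote outside the `size()` bytes it was given.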

Leo
1

There are many possible causes of crashes. It's always difficult to locate them, especially when they differ from debug to release mode.

On the other hand, since you are using C++, you could sidestep this by using a std::string instead of a manually allocated buffer; there is a reason RAII exists ;)

Matthieu M.
  • I use std::wstring everywhere possible, but in this place I need to pass a non-Unicode char array to a third-party function. – Matěj Zábský Feb 27 '10 at 14:30
  • Are you sure that the third-party function does not `delete` in some cases? Also, `std::string` has a `data()` member function which returns a `const char*` (a non-const overload exists only since C++17). – Matthieu M. Feb 28 '10 at 12:39
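For the case discussed in these comments, where a writable `char*` has to be handed to a third-party function, one possible RAII sketch uses `std::vector<char>` as the owning buffer. Note that `third_party_fill` below is a made-up stand-in (the real function is not shown in the question), and so is `call_with_owned_buffer`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the third-party function: writes a
// NUL-terminated string into a caller-provided buffer of `capacity` bytes.
void third_party_fill(char* out, std::size_t capacity)
{
    assert(out != nullptr);
    const char text[] = "sixteen chars ok";  // 16 characters plus the NUL
    if (capacity >= sizeof text) {
        std::memcpy(out, text, sizeof text);
    } else if (capacity > 0) {
        out[0] = '\0';  // not enough room: return an empty string
    }
}

// The vector owns the memory, so there is no new[]/delete[] pair to get
// wrong and the buffer is released on every exit path.
std::string call_with_owned_buffer(std::size_t len)
{
    std::vector<char> buf(len + 1, '\0');
    third_party_fill(buf.data(), buf.size());
    return std::string(buf.data());
}
```

On a pre-C++11 compiler such as Visual Studio 2008, `&buf[0]` can be used in place of `buf.data()`. This does not cure corruption that happens elsewhere, but it removes this particular manual delete from the list of suspects.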
1

It sounds like you have an uninitialised variable somewhere in the code.

In debug mode all the memory is initialised to something standard, so you will get consistent behavior.

In release mode the memory is not initialised unless you explicitly do something.

Run your compiler with the warnings set at the highest level possible.
Then make sure your code compiles with no warnings.
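To illustrate what "explicitly do something" can look like, here is a minimal sketch. The `Settings` struct and `make_zeroed` function are invented for this example (the member names merely echo the parameters of `Save` in the question); the point is only that every member and every buffer element gets a defined value in all build modes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example type: the constructor gives every member a defined
// value, so behavior no longer depends on what the heap or stack contained.
struct Settings {
    int width;
    int height;
    bool enable_overlay;

    Settings() : width(0), height(0), enable_overlay(false) {}
};

// Returns a buffer of `count` shorts, all zero. std::vector value-initializes
// its elements, unlike a plain `new short[count]`, which leaves them
// indeterminate.
std::vector<short> make_zeroed(std::size_t count)
{
    return std::vector<short>(count);
}
```

With Visual Studio, the highest commonly used warning level is `/W4`; among other things it reports some uses of uninitialized local variables.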

Martin York
0

One problem I ran into with this symptom: a multi-process program crashed on me when run from the shell, but ran flawlessly when called from valgrind or gdb. I discovered (much to my embarrassment) that I had a few stray processes of the same program still running in the system, causing a mq_send() call to return with an error. The problem was that those stray processes were also assigned the message queue handle by the kernel/system, so the mq_send() in my newly spawned process(es) failed, but nondeterministically (depending on kernel scheduling circumstances).

Like I said, trivial, but until you find it out, you'll tear your hair out!

I learnt from this hard lesson, and my Makefile these days has all the appropriate commands to create a new build, and cleanup the old environment (including tearing down old message queues and shared memory and semaphores and such). This way, I don't forget to do something and have to get heartburn over a seemingly difficult (but clearly trivially solvable) problem. Here is a cut-and-paste from my latest project:

[Makefile]
all:
      ...
...

obj:
      ...
clean:
      ...
prep:
  @echo "\n!! ATTENTION !!!\n\n"
  @echo "First: Create and mount mqueues onto /dev/mqueue (Change for non ubuntu)"
  rm -rf /run/shm/*Pool /run/shm/sem.*;
  rm -rf /dev/mqueue/Test;
  rm -rf /dev/mqueue/*Task;
  killall multiProcessProject || true;
Sonny
0

These two are the first two lines in their function.

If you really mean that the way I interpret it, then the first line is declaring a local variable buf in one function, but the delete is deleting some different buf declared outside the second function.

Maybe you should show the two functions.

Steve Fallows
0

Have you tried simply isolating this with the same build file but code based just on what you've put above? Something like:

int main(int argc, char* argv[] )
{
    const int len( 16 );
    char* buf = new char[len + 1]; 

    delete [] buf;
}

The code you've given is absolutely fine and, on its own, should run with no problems either in debug or optimised. So if the problem isn't down to specifics of your code, then it must be down to specifics of the project (i.e. compilation / linkage).

Have you tried creating a brand new project and placing the 10K+ lines of C++ into it? Might not take too long to prove the point. Especially if the existing project has either been imported in or heavily altered.

Component 10
  • Just a thought, but have you tried placing some debug output before and after the delete? From what you say, you've identified the delete as the source of the problem, but the error message is unclear about where the fault actually happens. It may be that the delete itself is fine but something then attempts to access that memory after the delete. It's also generally good practice to set buf to 0 after deleting it to prevent double delete problems and to make it easy to test if the pointer is valid or not. – Component 10 Feb 27 '10 at 20:49
0

I was having the same issue, and I figured out that my program was only crashing when I went to delete[] char pointers with a string length of 1.

void DeleteCharArray(char* array){
 if(strlen(array)>1){delete [] array;}
 else{delete array;}
}

This fixed the issue for me, although it is still error-prone and could be improved. I suspect the reason is that, to C++, char* str=new char[1] and char* str=new char; are the same thing, so using delete[] (which is meant for arrays only) on such a pointer gives unexpected and often fatal results.

  • I assume you have checked all the code that executes before the delete for out-of-bounds writes (writing past the end of an array)? These errors are usually the result of that. Deleting char arrays of length 1 is just fine. – Matěj Zábský Sep 18 '11 at 14:58
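To back up the comment above: in C++ the form of delete has to match the form of new, whatever the element count. Memory from `new char[n]` is released with `delete[]` even when `n` is 1, and `new char[1]` is not interchangeable with `new char`. One way to make the pairing automatic on a C++11 or later compiler is the array form of `std::unique_ptr`. The helper name `copy_c_string` below is hypothetical and this is only a sketch of the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns an owned, NUL-terminated copy of `text`.
// std::unique_ptr<char[]> always releases with delete[], so the matching
// form of delete is chosen by the type rather than by a strlen() check.
std::unique_ptr<char[]> copy_c_string(const char* text)
{
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    std::unique_ptr<char[]> copy(new char[len + 1]);
    std::memcpy(copy.get(), text, len + 1);  // includes the terminating NUL
    return copy;
}
```

A one-character or even empty string goes through exactly the same path as a long one; no special case is needed.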