
I have a weird race condition in one of my programs which causes it to crash only in release mode and only outside the Visual Studio environment.

If I launch this process inside Visual Studio with F5 (either release or debug), it just works.

If I create a release build with debug information, it doesn't crash.

I'm wondering how one would debug such a problem, and why it isn't crashing inside Visual Studio. Does Visual Studio slow down an executable even when launching the release version of it?

Sebastian
Marco A.
  • Sounds like at some point, you're invoking undefined behavior – nikolas Aug 28 '13 at 08:58
  • I'm not even sure of the problem because I can't debug it and the application's code is too big to post it here – Marco A. Aug 28 '13 at 09:00
  • When you invoke UB, anything can happen, including what you observe. There are countless ways to invoke UB, and no one will be willing to even start listing them here. – PlasmaHH Aug 28 '13 at 09:01
  • @DavidKernin I had this problem before, and it turned out in the end that MSVC, even in release mode, auto initializes variables, while this does not happen when not executing the application from the debugger. The problem there was an uninitialized pointer. But that's only one of a bajillion possible reasons – nikolas Aug 28 '13 at 09:02
  • If it only happens in release then it sounds like you are not initializing all your variables. Most compilers will zero initialize all uninitialized variables in debug mode but do nothing in release mode (basically making them random). Increase the warning level of your compiler and fix all the warnings (especially those to do with uninitialized variables). – Martin York Aug 28 '13 at 09:04
  • Make a copy of your project and remove features from it one by one, up to the point where the problem disappears. Then check whether the last removed feature is the source of the problem. That would be my approach to this problem. – PiotrNycz Aug 28 '13 at 09:06
  • When you say "race condition" what precisely are you seeing that makes you suspect a race condition? – doctorlove Aug 28 '13 at 09:07
  • You should try and see if you can put your code through some static analysis. This should tell you all about uninitialized variables and other memory problems. – juanchopanza Aug 28 '13 at 09:10
  • What do you mean by "can't debug it"? Doesn't your code have any logs that could help find the place where it crashes (or at least the reason why)? – zoska Aug 28 '13 at 09:21
  • thanks, but I already have the maximum warning level (it even throws errors for unused variables). @doctor I'm just guessing, might also be memory corruption but if I try to log (by outputting it to a file) the error disappears, so I think it's a timing-related issue – Marco A. Aug 28 '13 at 09:43
  • I don't understand why this is considered too broad. "I'm wondering how one would debug such a problem" seems a totally valid question. – Sebastian Aug 29 '13 at 12:48
  • Upon reflection, close as duplicate. Expanded on my own answer here: http://stackoverflow.com/a/18513077/214777 – Sebastian Aug 29 '13 at 14:06

3 Answers


The question is really how to debug an application without changing the runtime behavior that causes the crash. The answer is better post-mortem diagnostics.

You can improve your exception handling code, and if this is a production application, you should.

  1. Install a custom termination handler using std::set_terminate

    If you want to debug this problem locally, you could run an endless loop inside the termination handler and output some text to the console to notify you that std::terminate has been called. Then attach the debugger and check the call stack.

    In a production application you might want to send an error report back home, ideally together with a small memory dump that allows you to analyze the problem.

  2. Microsoft has a structured exception handling (SEH) mechanism that allows you to catch both hardware and software exceptions. See MSDN. You could guard parts of your code using SEH and use the same approach as in 1) to debug the problem. SEH gives more information about the exception that occurred, which you could include when sending an error report from a production app.
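The local-debugging variant of step 1 could look roughly like the sketch below. The names `wait_for_debugger_on_terminate`, `install_terminate_handler` and `g_debugger_may_continue` are made up for illustration; only `std::set_terminate` and the other standard library calls come from the language itself.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <thread>

// Hypothetical flag: after attaching the debugger and inspecting the call
// stack, set this to true from the debugger to let the process abort.
volatile bool g_debugger_may_continue = false;

// Replacement for the default terminate behaviour. It announces itself on
// stderr and then idles so that a debugger can be attached.
[[noreturn]] void wait_for_debugger_on_terminate() {
    std::fputs("std::terminate called; attach a debugger now\n", stderr);
    std::fflush(stderr);
    while (!g_debugger_may_continue) {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    std::abort();  // a terminate handler must not return
}

// Installs the handler above and returns the handler it replaced.
std::terminate_handler install_terminate_handler() {
    return std::set_terminate(wait_for_debugger_on_terminate);
}
```

Call `install_terminate_handler()` early in `main`. As far as I know, the Microsoft runtime keeps terminate handlers per thread, so in a multithreaded program each thread probably needs to install it as well.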

If it really is a race condition, then the right timing is crucial, and I guess that attaching the debugger, even in Release mode, does change the behavior and thus the timing.

Sebastian

This answers "what is different", but may not be the full answer to why your code has a race condition in release mode.

One thing that changes when you run the release build outside of VS is which heap the runtime uses. As far as I understand, a debug heap is used inside VS even in release mode.

Since heap allocations by definition have to be locked, using the debug heap (which fills the memory before it's given to the client code, and fills it again when the memory is freed) will block competing threads more often, causing more sequential execution. This may be part of the reason the race only shows up outside VS.

If you set the environment variable _NO_DEBUG_HEAP=1 in your debug environment (Configuration->Debugging->Environment Variables...), the debug heap is turned off and you will get the same heap behavior in the debugger as outside it.

Unfortunately, this sort of thing can be quite tricky to debug. One thing I've found useful is to store an array of values for "where I've been" (the simpler the array, the better: integer values, a small string, or something similar), rather than printing something every time. If you can stop in the debugger or detect the crash in some way, you can then dump the "trace" and see how you got to where you are now, and what threads were involved.
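A minimal sketch of such a "where I've been" array, assuming a fixed-size ring buffer of packed integers, is shown here. All names (`trace`, `g_trace`, `kTraceSize`, the tag and checkpoint ids) are hypothetical, and the ids are whatever numbers you choose to assign to your threads and code locations.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Number of slots; a power of two so that wrap-around is a cheap mask.
constexpr std::size_t kTraceSize = 1024;

// Each slot packs (thread_tag << 32) | checkpoint into one 64-bit value.
// Static storage, so all slots start out as zero.
std::atomic<std::uint64_t> g_trace[kTraceSize];

// Total number of events recorded so far; the next slot is this modulo size.
std::atomic<std::size_t> g_trace_next{0};

// Records one event. No locks and no I/O, so it disturbs timing far less
// than writing to a log file.
inline void trace(std::uint32_t thread_tag, std::uint32_t checkpoint) {
    const std::size_t slot =
        g_trace_next.fetch_add(1, std::memory_order_relaxed) & (kTraceSize - 1);
    const std::uint64_t packed =
        (static_cast<std::uint64_t>(thread_tag) << 32) | checkpoint;
    g_trace[slot].store(packed, std::memory_order_relaxed);
}

// Helpers for decoding a slot when dumping the trace.
inline std::uint32_t trace_thread(std::uint64_t packed) {
    return static_cast<std::uint32_t>(packed >> 32);
}
inline std::uint32_t trace_checkpoint(std::uint64_t packed) {
    return static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}
```

Sprinkle calls such as `trace(1, 10)` through the suspect code, then inspect `g_trace` and `g_trace_next` in the debugger or from a crash handler. Relaxed atomics keep the overhead small, but the order of entries is only approximately the order in which things happened across threads.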

Mats Petersson

The key difference between F5 in Visual Studio and running a program on its own is that Windows runs the program on a special debug heap when the program is initially started under a debugger. The debug heap differs from the ordinary one, and in some cases this can lead to bugs manifesting themselves on the ordinary heap only.

You could run the program first and attach the debugger afterwards; it would then use the ordinary heap and your bug should happily reproduce. To prevent the program from "going too far" before you attach the debugger, you can insert a call to the Sleep() function at the entry point.
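As a variation on the plain Sleep() call, the sketch below polls the Win32 function `IsDebuggerPresent()` so the program continues as soon as a debugger attaches, or after a timeout. The helper names `debugger_attached` and `wait_for_debugger` are made up for illustration, and outside Windows the detection is only a stub that reports false.

```cpp
#include <chrono>
#include <thread>

#ifdef _WIN32
#include <windows.h>
#endif

// Returns true if a debugger is currently attached to this process.
// Only implemented for Windows here; elsewhere this sketch reports false.
inline bool debugger_attached() {
#ifdef _WIN32
    return IsDebuggerPresent() != FALSE;
#else
    return false;
#endif
}

// Polls until a debugger attaches or `timeout` elapses.
// Returns true if a debugger was detected before the timeout.
inline bool wait_for_debugger(std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!debugger_attached()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
    }
    return true;
}
```

Calling `wait_for_debugger(std::chrono::seconds(30))` as the first statement of `main` gives you half a minute to attach. Since the process was not started under the debugger, it should still be running on the ordinary heap.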

sharptooth