On Windows 7 x64, when I attach in x86 mode to a fairly complex free-running app, it runs for a while, then reproducibly exits.

MyApp.exe Managed (v4.0.30319)' has exited with code -1073740791 (0xc0000409).

followed immediately by

MyApp.vshost.exe: Managed (v4.0.30319)' has exited with code 0 (0x0).

Sometimes it runs long enough to hit my breakpoint and I can inspect the state, but when I press F5 to continue, the app exits in the same fashion.

A quick search for the error code tells me that it's a stack buffer overrun. I hear that it might be caused by incorrect unmanaged interop code.

I can run from the debugger OK (F5), but starting the app free-running and then attaching always has this problem.

Any thoughts on how I could narrow it down?

EDIT: Here's a call stack I am seeing on a different machine (Windows Server 2008 R2 x64); it might be related:

clr.dll!__crt_debugger_hook()
clr.dll!___report_gsfailure() + 0xeb bytes
clr.dll!_DoJITFailFast@0() + 0x8 bytes
clr.dll!CrawlFrame::SetCurGSCookie() + 0x2e9c4f bytes
clr.dll!StackFrameIterator::Init() + 0x60 bytes
clr.dll!Thread::StackWalkFramesEx() + 0x8a bytes
clr.dll!Thread::StackWalkFrames() + 0x87 bytes
clr.dll!CNameSpace::GcScanRoots() + 0xd7 bytes
clr.dll!WKS::gc_heap::mark_phase() + 0xae bytes
clr.dll!WKS::gc_heap::gc1() + 0x7b bytes
clr.dll!WKS::gc_heap::garbage_collect() + 0x1c1 bytes
clr.dll!WKS::GCHeap::GarbageCollectGeneration() + 0xba bytes
clr.dll!WKS::gc_heap::try_allocate_more_space() + 0x1cd0 bytes
clr.dll!WKS::gc_heap::allocate_more_space() + 0x13 bytes
clr.dll!WKS::GCHeap::Alloc() + 0x507 bytes
clr.dll!Alloc() + 0x5a bytes
clr.dll!SlowAllocateString() + 0x41 bytes
clr.dll!UnframedAllocateString() + 0x11 bytes
clr.dll!StringObject::NewString() + 0x26 bytes
clr.dll!Int64ToDecStr() + 0x12e bytes
clr.dll!COMNumber::FormatInt64() + 0x17e bytes
mscorlib.ni.dll!6c60b8e1()
[Frames below may be incorrect and/or missing, no symbols loaded for mscorlib.ni.dll]

EDIT2: Things seem fine with the x64 build of the app; the issue only appears in the x86 build.

GregC
  • You might find more information in your OS Event Viewer? – Jason Williams Jun 11 '11 at 21:23
  • @Jason Williams: nothing in the event viewer. Good try – GregC Jun 11 '11 at 21:27
  • Any access over UNC paths or to protected locations in the code? does running as a privileged user negate the problem? – Mike Miller Jun 11 '11 at 21:44
  • @Mike Miller: running on a dev box, full privileges, local access only. From the error code, it sounds like something is trashing my stack. Prolly bad P/Invoke somewhere. – GregC Jun 11 '11 at 21:46
  • good luck and please update with results as I'm now more interested than when Bouncer had his own episode in Neighbours (Bouncers Dream) - that rocked. – Mike Miller Jun 11 '11 at 21:52
  • I am able to inspect state by taking a ProcDump of a running process that has hit an assertion, then opening the full memory dump in Visual Studio 2010 SP1. I still cannot continue from a breakpoint or from an assertion. – GregC Sep 26 '11 at 00:04

3 Answers

From the Windows SDK ntstatus.h header file:

//
// MessageId: STATUS_STACK_BUFFER_OVERRUN
//
// MessageText:
//
// The system detected an overrun of a stack-based buffer in this application. This overrun 
// could potentially allow a malicious user to gain control of this application.
//
#define STATUS_STACK_BUFFER_OVERRUN      ((NTSTATUS)0xC0000409L)    // winnt

A buffer overrun on a stack-allocated buffer is an infamous virus injection vector. Microsoft got very serious about eliminating that potential threat in their code. The C and C++ languages were first. Managed code straggled behind, because this is not something that is supposed to happen in a managed execution environment.

Nevertheless, the version 4 CLR was built with the protection in place, unlike earlier CLR versions. And it does its job, although it is exceedingly rare for it to happen. I've seen a question about it only once before.

Solving this problem is going to be difficult, especially when you have no obvious lead on which unmanaged code in your application might be tripping this protection. The best thing to do is to make a minimal repro and contact Microsoft Support to show them what is going wrong. Finding out what trips it while working on the repro is a likely outcome.

Hans Passant
  • Behavior seems different on different platforms. I see GSCookie on Server 2008 R2 x64, but no such message on Windows 7 x64. – GregC Jul 29 '11 at 14:45
  • I accepted this answer because it stated that C/C++ code is the likely culprit, and it actually was, in my case. – GregC May 10 '16 at 14:04

Is the interop signature correct? Try using http://clrinterop.codeplex.com/releases/view/14120 to generate it, then try again.

Mike Miller
  • did I mention this is a very complex application? We have 185 DllImport statements. – GregC Jun 11 '11 at 21:31
  • Yesterday I was removing layers of my app to see what component caused it. I found out that the issue is not with P/Invoke signature, but rather in a precompiled module written in managed C++, targeting .NET 4.0. When calling into Managed C++, there's no P/Invoke required. It just works. Or not, in my case – GregC Jul 29 '11 at 14:44

It looks like I should be able to narrow it down by injecting GC.Collect() calls in my code: garbage collection checks GSCookie among other things.

Broken link1: http://7388.info/index.php/article/studio/2010-10-17/354.html

Broken link2: http://www.pubsub.com/Investigating-a-GSCookie-Corruption_Windows-NET-Troubleshooting-PInvoke-5wbEHu80dzF,rZ5U5DaVJaE

GregC
  • I am currently running with Jinx, and unable to force the code path to fail. I am still unable to attach to a free-running app, as it immediately exits. – GregC Jul 26 '11 at 18:56
  • In retrospect, I feel it's one of the unmanaged libraries that we were relying on. Once it's compiled out, no more problem. – GregC Apr 30 '12 at 22:20
  • The link http://7388.info/index.php/article/studio/2010-10-17/354.html in this post currently produces a site temporarily unavailable message http://p3nlhclust404.shr.prod.phx3.secureserver.net/SharedContent/redirect_1.html –  May 31 '12 at 03:58
  • @SteveMylroie I looked at the way-back machine. Nothing in the archives. – GregC May 31 '12 at 14:04
  • Despite both of these links now being broken, this comment helped me figure out a similar problem (indeed with a P/Invoke call). I was getting it whenever an exception was thrown, but forcing GC made it possible to pinpoint the actual bug. +1 – ThFabba Dec 14 '17 at 16:08
  • Sorry about the links, glad it helped. Unlike the universe, the internet is shrinking – GregC Dec 14 '17 at 19:36