23

I have a long-running .NET 4.5 application that crashes randomly, leaving the message I've mentioned in the question title in the event log. The issue reproduces on 3 different machines and 2 different operating systems (Windows Server 2008 R2 and 2012). The application doesn't use any unsafe or unmanaged components: it's pure managed .NET, and the only unmanaged piece is the CLR itself.

Here's the stack trace of the crash site that I've extracted from the dump:

clr.dll!MethodTable::GetCanonicalMethodTable()  
clr.dll!SVR::CFinalize::ScanForFinalization()  - 0x1a31b bytes  
clr.dll!SVR::gc_heap::mark_phase()  + 0x328 bytes   
clr.dll!SVR::gc_heap::gc1()  + 0x95 bytes   
clr.dll!SVR::gc_heap::garbage_collect()  + 0x16e bytes  
clr.dll!SVR::gc_heap::gc_thread_function()  + 0x3e bytes    
clr.dll!SVR::gc_heap::gc_thread_stub()  + 0x77 bytes    
kernel32.dll!BaseThreadInitThunk()  + 0x1a bytes    
ntdll.dll!RtlUserThreadStart()  + 0x21 bytes    

This issue closely resembles the one that was discussed here, so I tried the solutions suggested in that topic, but none of them helped:

  • I've tried installing this hotfix, but it won't install on any of my machines (KB2640103 does not apply, or is blocked by another condition on your computer), which actually makes sense, because I'm using 4.5, not 4.0.

  • I've tried disabling concurrent GC and/or enabling server GC. Right now the relevant part of my app.config looks like this:

    <?xml version="1.0"?>
    <configuration>
        <runtime>
            <gcConcurrent enabled="false"/>
            <gcServer enabled="true"/>
        </runtime>
        <startup>
            <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5"/>
        </startup>
    </configuration>
    

Though the weird thing is I still find multiple GC-related threads in the process dump. Besides the one the crash occurs in, there are 7 threads with the following stack trace:

ntdll.dll!NtWaitForSingleObject()  + 0xa bytes  
KERNELBASE.dll!WaitForSingleObjectEx()  + 0x9a bytes    
clr.dll!CLREventBase::WaitEx()  + 0x13f bytes   
clr.dll!CLREventBase::WaitEx()  + 0xf7 bytes    
clr.dll!CLREventBase::WaitEx()  + 0x78 bytes    
clr.dll!SVR::t_join::join()  + 0xd8 bytes   
clr.dll!SVR::gc_heap::scan_dependent_handles()  + 0x65 bytes    
clr.dll!SVR::gc_heap::mark_phase()  + 0x347 bytes   
clr.dll!SVR::gc_heap::gc1()  + 0x95 bytes   
clr.dll!SVR::gc_heap::garbage_collect()  + 0x16e bytes  
clr.dll!SVR::gc_heap::gc_thread_function()  + 0x3e bytes    
clr.dll!SVR::gc_heap::gc_thread_stub()  + 0x77 bytes    
kernel32.dll!BaseThreadInitThunk()  + 0x1a bytes    
ntdll.dll!RtlUserThreadStart()  + 0x21 bytes    

This makes me wonder whether I somehow botched disabling the concurrent GC (which is why I listed the config above).
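One way to check which GC settings the runtime actually applied is to ask it at startup. The `SVR::` prefix in both stack traces suggests the server GC is active, and as far as I know the server GC runs one dedicated GC thread per logical processor, so several GC threads parked in `t_join::join` would be expected and would not by themselves mean concurrent GC is still on. The sketch below uses only the standard `System.Runtime.GCSettings` API; `GcModeReport` is a hypothetical helper name, not something from the original application.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Builds a one-line summary of the GC configuration the runtime applied.
    // With concurrent GC disabled, LatencyMode is expected to report Batch.
    public static string Describe()
    {
        return string.Format(
            "Server GC: {0}, latency mode: {1}, logical processors: {2}",
            GCSettings.IsServerGC,
            GCSettings.LatencyMode,
            Environment.ProcessorCount);
    }
}
```

Logging `GcModeReport.Describe()` once at startup should show whether the `app.config` values were picked up by the process that actually crashes.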

I think that wraps up what I've managed to find so far. I could really use some help on how to proceed with dealing with this issue.

HellBrick
  • The object header of a managed object on the GC heap got corrupted, it can't find the method table of the type anymore. You always first look for unmanaged code you interop with to look for a reason. Tinkering with the gc config doesn't fix the problem. – Hans Passant Oct 03 '13 at 11:37
  • Maybe a problem in a finalizer? You could try setting breakpoints in finalizers or commenting them out. – DSway Oct 04 '13 at 19:16
  • `scan_dependent_handles`: dependent handles were added recently to the CLR (4.0?). Maybe it is a genuine bug in the CLR. – usr Oct 07 '13 at 11:35
  • @HellBrickAK, did you ever find a solution? I am stuck with a very similar problem. – zaitsman Mar 08 '14 at 23:27
  • Unfortunately no. I didn't have enough time to investigate this issue any further, so I had to revert the feature responsible for it. I've re-implemented it from scratch recently, and it seems to work fine so far, but I still fail to grasp what I did wrong in the first attempt. – HellBrick Mar 10 '14 at 07:29
  • I have exactly the same problem and symptoms as you. For me, it appears that the issue originates from the new `RunAndCollect` (Reflection.Emit) assemblies; when I just use `Run` or `RunAndSave` mode, it all works fine. The relevant link can be found at http://msdn.microsoft.com/en-us/library/system.reflection.emit.assemblybuilderaccess(v=vs.110).aspx . – atlaste Mar 31 '14 at 16:06
  • Just made a Microsoft Connect bug report @ https://connect.microsoft.com/VisualStudio/feedback/details/844183/net-runtime-crashes-with-error-80131506-when-using-assemblybuilderaccess-runandcollect . Note that I can reproduce the issue here. – atlaste Mar 31 '14 at 16:23

7 Answers

6

I am drawing on past experience with our own application. This can happen when an exception goes unhandled all the way up to the finalizer thread; once it gets there, it crashes the application.

Before changing anything in the GC configuration, do one quick check.

Are you using the Task Parallel Library? If so, make sure you are handling exceptions properly. If exceptions thrown on different threads are left unhandled, they surface on the finalizer thread, which then crashes the application. There are a couple of ways to handle them neatly. Handling `AggregateException` is one of them (and it is what solved the problem for us).

http://msdn.microsoft.com/en-us/library/dd537614.aspx
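A minimal sketch of the `AggregateException` approach, assuming the tasks are waited on at some point. `TaskErrors` and `WaitAndCollect` are hypothetical names used only for illustration; the calls themselves (`Task.WaitAll`, `AggregateException.Flatten`) are standard TPL APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TaskErrors
{
    // Waits for all tasks and returns every exception they threw.
    // Catching the AggregateException here observes the task exceptions,
    // so they are not left for the finalizer thread to deal with.
    public static IList<Exception> WaitAndCollect(params Task[] tasks)
    {
        var errors = new List<Exception>();
        try
        {
            Task.WaitAll(tasks);
        }
        catch (AggregateException ex)
        {
            // Flatten unwraps nested AggregateExceptions from child tasks.
            foreach (var inner in ex.Flatten().InnerExceptions)
            {
                errors.Add(inner);
            }
        }
        return errors;
    }
}
```

The caller can then log or rethrow the collected exceptions at a point where the failure can still be handled deliberately.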

I don't have 50 points to add a comment, so adding it as an answer...

SridharVenkat
  • The issue indeed started occurring after I've enabled a component that's using TPL quite actively, but I don't think unhandled exceptions are at fault here. The reasons are: 1. All callbacks executed on tasks are wrapped in try-catch blocks; 2. I've subscribed to AppDomain.Current.UnhandledException, and AFAIR it's triggered in this task exception + finalizer case; 3. I don't see how it could possibly corrupt a managed heap, which is what seems to be happening here. – HellBrick Oct 08 '13 at 10:37
  • 1) Are you saying AppDomain.Current.UnhandledException is triggered? That means something was left unhandled; log it and get more data there. 2) Exceptions on the Finalizer thread are fatal. 3) In your dump analysis, run '!threads', find the Finalizer thread, and run '!pe'; you should see the exception. If that's the case, let me know. – SridharVenkat Oct 08 '13 at 11:32
  • I meant I have an AppDomain.Current.UnhandledException handler, but it is NOT triggered in my app, even though it should have been if it was a simple finalizer exception (I've just double-checked this by the following test app: [http://pastebin.com/9EgzBZQA](http://pastebin.com/9EgzBZQA)). Or are the unhandled task exceptions propagated in some other way that doesn't include throwing them from the finalizer? About the dump exploration suggestions: I'll try them later, first I need to research what did you even mean by them =) (This whole dump thing is kind of new to me) – HellBrick Oct 09 '13 at 06:39
  • You can also get unhandled exceptions if you fail to examine the Result property of a task. Don't forget you must also attach something to `TaskScheduler.UnobservedTaskException` to avoid process teardown. We only found out about the former by handling the latter. – escape-llc May 23 '14 at 11:28
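The `TaskScheduler.UnobservedTaskException` hook mentioned in the last comment can be sketched as follows. `UnobservedTaskLogging` and its `log` callback are hypothetical names for illustration. Note that in .NET 4.5 unobserved task exceptions no longer tear down the process by default (that only happens if `<ThrowUnobservedTaskExceptions enabled="true"/>` is set in the config), so on 4.5 the handler is mainly useful for finding tasks whose failures nobody looked at.

```csharp
using System;
using System.Threading.Tasks;

public static class UnobservedTaskLogging
{
    // Subscribes a handler that passes each unobserved task exception to 'log'
    // and marks it as observed so it is not escalated any further.
    // The event fires when a faulted, never-observed task is finalized.
    public static void Install(Action<Exception> log)
    {
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            foreach (var inner in e.Exception.Flatten().InnerExceptions)
            {
                log(inner);
            }
            e.SetObserved();
        };
    }
}
```

Calling `UnobservedTaskLogging.Install(...)` once at startup is enough; the event only fires after a garbage collection finalizes the faulted task, so entries may show up well after the original failure.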
1

I realize this is an old post, but I ran into the same issue as the OP. The point atlaste made:

Change the runtime to x86 or x64 and try again; you can also mess with the concurrent GC settings like you already tried.

was the key for me. All of my projects were set to Any CPU except for one (coincidentally the entry point of the application, a Console Application project), which was set to x86. Once I changed it to Any CPU, the application ran correctly.

1

My problem was weird: every 5-10 minutes my application pool kept crashing with this exit code (80131506). I'm not sure whether you should trust the garbage collector in heavily threaded or scheduled-task workloads, but the following solution worked for me.

I added a job that calls GC.GetTotalMemory(true) every minute. I assume that, for some reason, the runtime was not triggering garbage collection often enough for the high number of disposable objects that I use. Either way, this fixed my problem! It's more of a quick fix than a final solution ;)
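The job described above could look roughly like this. It is a sketch that assumes a plain `System.Threading.Timer` rather than whatever scheduler the original application used; `PeriodicCollector` is a hypothetical name.

```csharp
using System;
using System.Threading;

public sealed class PeriodicCollector : IDisposable
{
    private readonly Timer _timer;
    private long _lastTotalBytes;

    // Starts a timer that calls GC.GetTotalMemory(true) once per interval.
    // Passing true lets the runtime collect before it reports the total.
    public PeriodicCollector(TimeSpan interval)
    {
        _timer = new Timer(
            _ => Interlocked.Exchange(ref _lastTotalBytes, GC.GetTotalMemory(true)),
            null,
            interval,
            interval);
    }

    // Managed heap size reported by the most recent run, or 0 before the first run.
    public long LastTotalBytes
    {
        get { return Interlocked.Read(ref _lastTotalBytes); }
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

Creating `new PeriodicCollector(TimeSpan.FromMinutes(1))` at application start and disposing it on shutdown matches the one-minute schedule from the answer. As the answer itself says, this works around the symptom rather than explaining the crash.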

0

The solution that helped me: uninstall .NET 4.5.1, install 4.0, install the mentioned hotfix, then reinstall 4.5.1.

Genrih
0

I just finished a conversation with Microsoft since I have been able to reproduce an issue which is similar.

In my case it was a bug in the .NET runtime that has to do with mixing dynamic types and non-dynamic code. I'm not sure if this is also the case in your scenario, but there are some things you might want to try:

  • Run the code on Windows 8.1 (latest updates). Apparently Windows 8.1 has a more recent version of .NET than the other versions of Windows.
  • If you use AssemblyBuilder (like I did), try to change it to Run mode instead of RunAndCollect.
  • Change the runtime to x86 or x64 and try again; you can also mess with the concurrent GC settings like you already tried.
  • My bug is being fixed as we speak, which basically means there will be a Windows update that takes care of it. Perhaps simply waiting for that is also an option; I don't expect it to take too long, since the bug is quite critical for a lot of programs.
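To make the `Run` versus `RunAndCollect` suggestion concrete, here is a small sketch that switches the access mode in one place. `DynamicAssemblyFactory` and the `collectible` flag are hypothetical names; the static `AssemblyBuilder.DefineDynamicAssembly` overload is the one available from .NET 4.5 onwards.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicAssemblyFactory
{
    // Defines a dynamic assembly. With collectible == false the assembly uses
    // AssemblyBuilderAccess.Run and stays loaded for the lifetime of the AppDomain;
    // with collectible == true it uses RunAndCollect and can be unloaded by the GC.
    public static AssemblyBuilder Create(string name, bool collectible)
    {
        AssemblyBuilderAccess access = collectible
            ? AssemblyBuilderAccess.RunAndCollect
            : AssemblyBuilderAccess.Run;

        return AssemblyBuilder.DefineDynamicAssembly(new AssemblyName(name), access);
    }
}
```

Routing all dynamic assembly creation through one such helper makes it easy to flip to `Run` temporarily and see whether the crashes stop, at the cost of the emitted assemblies never being unloaded.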
atlaste
0

I finally found a fix I could install. I also have 4.5, and the other fix for 4.0 would not install. Removing 4.5 did not help either. The fix in the link below is what actually resolved it.

http://kb.machsol.com/Knowledgebase/Article/50305

acheron55
0

We had the same problem in our .NET 4.5 desktop app, a web scraper. It crashed randomly under heavy load. We spent a few months searching for the cause and tried everything: disabling concurrent GC, switching to Server mode, and many other workarounds. Eventually we realized that the crashes occurred because of the PhantomJS module, which uses some unmanaged resources and doesn't clean them up afterwards :( So we created a stand-alone console app for the PhantomJS integration. Now we launch this console app with Process.Start from our web scraper and kill it afterwards. Scraping takes more time, but there are no more crashes!
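The process-isolation approach can be sketched like this. The helper name `IsolatedRunner`, the timeout, and the idea of passing the work as command-line arguments are assumptions for illustration; the answer only says that the console app is started with `Process.Start` and killed afterwards.

```csharp
using System;
using System.Diagnostics;

public static class IsolatedRunner
{
    // Runs a helper executable in its own process so that any unmanaged
    // resources it leaks are reclaimed by the OS when the process ends.
    // Returns the exit code, or null if the helper had to be killed on timeout.
    public static int? Run(string exePath, string arguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            if (process.WaitForExit((int)timeout.TotalMilliseconds))
            {
                return process.ExitCode;
            }

            // The helper is still running: kill it and wait for it to go away.
            process.Kill();
            process.WaitForExit();
            return null;
        }
    }
}
```

The trade-off is the one the answer mentions: starting a process per job is slower than an in-process call, but a crash or leak in the helper can no longer corrupt the main application's heap.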

Daniel Vygolov