
I'm trying to figure out why we hit a System.OutOfMemoryException in our server application. The machine has 256 GB of memory and 48 cores, and the exception happens when the process is using around 50-70 GB, so there is plenty of headroom.

I have a dump, but unfortunately it was taken only after the stack had been unwound. Using WinDbg and this command:

.foreach(tempVariable {!dumpheap -type System.OutOfMemoryException -short}){!pe -nested tempVariable;.echo *************}

I can see there are a lot of exceptions on the heap of this type:

Exception object: 0000009034f610d0
Exception type:   System.OutOfMemoryException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 8007000e

Nested exception -------------------------------------------------------------
Exception object: 00000097711dca00
Exception type:   System.OutOfMemoryException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
    SP               IP               Function
    0000009FF5115F70 0000000000000000 mscorlib_ni!System.Environment.GetResourceFromDefault(System.String)+0x1
    0000009FF5115F70 00007FF83630240E mscorlib_ni!System.Environment.GetResourceString(System.String, System.Object[])+0xe
    0000009FF5115FB0 00007FF7DA46CCAC UNKNOWN!common2.taskservice.SimpleLockingTaskQueue.RunReadonlyEvent(TaskQueueEntry)+0xfbc
    0000009FF511E7E0 00007FF7DA46BCD1 UNKNOWN!common2.taskservice.SimpleLockingTaskQueue+<>c__DisplayClass31_0.<MainQueueWorkerThread>b__1()+0x31
    0000009FF511E810 00007FF83638D436 mscorlib_ni!System.Threading.Tasks.Task.Execute()+0x46

StackTraceString: <none>
HResult: 8007000e
*************
Exception object: 0000009056032ef0
Exception type:   System.OutOfMemoryException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
    SP               IP               Function
    0000009F533FE430 0000000000000000 mscorlib_ni!System.Threading.Monitor.Enter(System.Object)+0x1
    0000009F533FE430 00007FF7DA361FD0 UNKNOWN!common2.objectevents.RequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].EventCall(System.Action)+0xa0
    0000009F533FE7C0 00007FF7DA609774 UNKNOWN!common2.objectevents.RequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].<ExecuteTail>b__56_0()+0x44
    0000009F533FE7F0 00007FF7DA46BEBB UNKNOWN!common2.taskservice.SimpleLockingTaskQueue.RunReadonlyEvent(TaskQueueEntry)+0x1cb

So almost all of the OutOfMemoryExceptions are thrown from within GetResourceFromDefault() or Monitor.Enter(). I guess the first one is trying to allocate memory to load the localized version of the exception message? But what about Monitor.Enter()?

The UNKNOWN module is of course our application.

Does anyone have any clues on how to figure out what exactly could be causing the exception?

Edit: More info

!gchandles
...
Statistics:
              MT    Count    TotalSize Class Name
00007ff83993f3b0        1           40 System.Xml.Linq.XNamespace
00007ff836567040        1           48 System.SharedStatics
00007ff835e7b208        1           64 System.EventHandler`1[[Windows.Foundation.Diagnostics.TracingStatusChangedEventArgs, mscorlib]]
00007ff836566f28        3           72 System.Object
00007ff836534af8        1           80 System.Threading.PinnableBufferCache
00007ff831f51b88        1           80 System.Diagnostics.BooleanSwitch
00007ff831f45468        3          120 System.Net.TimerThread+TimerQueue
00007ff836567f70        2          128 System.Security.PermissionSet
00007ff836566e50        1          160 System.ExecutionEngineException
00007ff836566dd8        1          160 System.StackOverflowException
00007ff836566b78        1          160 System.Exception
00007ff836563900        1          192 System.Diagnostics.Tracing.FrameworkEventSource
00007ff836559a50        1          192 System.Threading.Tasks.TplEtwProvider
00007ff836548898        1          192 System.Collections.Concurrent.CDSCollectionETWBCLProvider
00007ff836534b58        1          192 System.Threading.PinnableBufferCacheEventSource
00007ff836525928        1          192 System.Threading.CdsSyncEtwBCLProvider
00007ff831f356e8        1          192 System.PinnableBufferCacheEventSource
00007ff82d10a570        1          192 System.Web.TelemetryEventSource
00007ff836567100        1          216 System.AppDomain
00007ff836574be0        2          320 System.NotSupportedException
00007ff836566ec8        2          320 System.Threading.ThreadAbortException
00007ff831f39f88        4          320 System.PinnableBufferCache
00007ff831f45658        6          336 System.Net.Logging+NclTraceSource
00007ff8365602b0        7          448 Microsoft.Win32.UnsafeNativeMethods+ManifestEtw+EtwEnableCallback
00007ff831f45740        6          480 System.Diagnostics.SourceSwitch
00007ff831f34d10        3          600 System.Net.ServicePoint
00007ff7d98575e0        4          704 common2.objectevents.GlobalCacheRetryException
00007ff8388c0ca8       33         2112 System.DirectoryServices.Protocols.VERIFYSERVERCERT
00007ff8388c5150       33         3168 System.DirectoryServices.Protocols.LdapConnection
00007ff83656c988        7        20504 System.Byte[]
00007ff836566d60      153        24480 System.OutOfMemoryException
00007ff836567d28     1263       121248 System.Threading.Thread
00007ff836548510        1       156336 System.Int64[]
00007ff83654a7d0     3835       276120 System.Reflection.Emit.DynamicResolver
00007ff83657fc70     2084       333440 System.RuntimeType+RuntimeTypeCache
00007ff836566fc0      480       502616 System.Object[]
00007ff835e6f918    90017     10081904 System.Threading.OverlappedData
Total 97964 objects

Handles:
    Strong Handles:       815
    Pinned Handles:       23
    Async Pinned Handles: 90017
    Ref Count Handles:    1
    Weak Long Handles:    5959
    Weak Short Handles:   1149

The faulting thread and the stack

0:134> ~#s
ntdll!NtWaitForSingleObject+0xa:
00007ff8`4b19070a c3              ret
0:134> !pe
Exception object: 00000097711df1a8
Exception type:   System.OutOfMemoryException
Message:          <none>
InnerException:   <none>
StackTrace (generated):
    SP               IP               Function
    0000009FF5110C60 0000000000000000 mscorlib_ni!System.Threading.Monitor.Enter(System.Object)+0x1
    0000009FF5110C60 00007FF83638C87E mscorlib_ni!System.Threading.Tasks.Task.AddException(System.Object, Boolean)+0xae
    0000009FF5110CD0 00007FF836D1B4C8 mscorlib_ni!System.Threading.Tasks.Task.HandleException(System.Exception)+0x88
    0000009FF5110D20 00007FF83638D4AB mscorlib_ni!System.Threading.Tasks.Task.Execute()+0xbb
    0000009FF511E850 00007FF83633CA72 mscorlib_ni!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+0x162
    0000009FF511E920 00007FF83633C904 mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+0x14
    0000009FF511E950 00007FF83638D6DC mscorlib_ni!System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)+0x21c
    0000009FF511EA00 00007FF83638CDF3 mscorlib_ni!System.Threading.Tasks.Task.ExecuteEntry(Boolean)+0x73
    0000009FF511EA40 00007FF836374882 mscorlib_ni!System.Threading.ThreadPoolWorkQueue.Dispatch()+0x152

Heap of free objects

0:134> !dumpheap -stat -type Free
Statistics:
              MT    Count    TotalSize Class Name
00007ff831f35af8        1           32 System.Net.SafeLocalFree
00007ff831f39188        1           48 System.Net.SafeFreeCredential_SECURITY
00007ff831f3abd0       11          440 System.Net.SafeFreeContextBufferChannelBinding_SECURITY
0000009032e03d20 18266055   5247426202      Free
Total 18266068 objects
Fragmented blocks larger than 0.5 MB:
            Addr     Size      Followed by
0000009052cca810   13.7MB 0000009053a72bc0 dbtaskmessages.KeyValueObject
00000090ef377af0   11.4MB 00000090efedffe0 System.String
00000090f14cf6c0   20.8MB 00000090f2996d18 dbtaskmessages.KeyValueObject
000000912a19b2b8   12.6MB 000000912ae3dbb0 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, mscorlib],[System.Object, mscorlib]]
00000091ec96bbd0   10.3MB 00000091ed3bd0f8 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, mscorlib],[System.Object, mscorlib]]
00000091ed8e66f8   13.0MB 00000091ee5efe20 System.String
00000091efce52b8   10.1MB 00000091f07067b0 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, mscorlib],[System.Object, mscorlib]]
000000922a3946e0   10.1MB 000000922ada5c48 System.Collections.Concurrent.ConcurrentQueue`1+Segment[[System.Action, mscorlib]]
000000926c6395f8   15.6MB 000000926d5d02e0 System.Int32[]
000000943169ca50   11.4MB 000000943220d3c0 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, mscorlib],[System.Object, mscorlib]]
00000094a39033d0   12.4MB 00000094a4573ee0 System.Collections.Concurrent.ConcurrentDictionary`2+Node[[System.Int64, mscorlib],[System.Object, mscorlib]]
000000955e5ef658   10.3MB 000000955f044318 common2.database.GlobalCaching.CachedObjectQueryTable`3+CachedCollection[[HnG_States.ProtoObjects.SocialConnection, HnG_States],[HnG_States.ProtoObjects.Indexes.SocialConnectionIndex+IndexByInviterAndInvitee, HnG_States],[System.String, mscorlib]]
00000098dcae2108   10.6MB 00000098dd57b508 common2.database.CachingKeyValueObject
00000098e7f6cb68   14.2MB 00000098e8d9da30 System.String
0000009ba8be9868   10.4MB 0000009ba9655030 Microsoft.Win32.SafeHandles.SafeWaitHandle
0000009baf6351a0   23.9MB 0000009bb0e18a20 System.UInt64[]
0000009be4ae9160   12.8MB 0000009be57a94c8 System.String

This list only shows blocks larger than 10 MB; in total there are 1145 entries.
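
To put the `Free` row above in perspective, here is a minimal Python sketch that turns one statistics row into a total size and an average block size. The function name `summarize_stat_row` and the regex-based parsing are illustrative assumptions, not part of WinDbg or SOS; the sketch only assumes the four-column layout (MT, Count, TotalSize, Class Name) shown in the output above.

```python
import re

def summarize_stat_row(line):
    """Parse one '!dumpheap -stat' row (MT, Count, TotalSize, Class Name).

    Hypothetical helper: it assumes the whitespace-separated column layout
    shown in the statistics output above.
    """
    m = re.match(r"\s*([0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$", line)
    if m is None:
        raise ValueError("not a statistics row: %r" % line)
    count = int(m.group(2))
    total = int(m.group(3))
    return {
        "class_name": m.group(4),
        "count": count,
        "total_bytes": total,
        "total_gib": total / 2**30,                     # bytes -> GiB
        "avg_bytes": total / count if count else 0.0,   # mean block size
    }

# The Free row copied verbatim from the dump output above.
row = summarize_stat_row("0000009032e03d20 18266055   5247426202      Free")
print("%s: %d blocks, %.2f GiB, average %.0f bytes" % (
    row["class_name"], row["count"], row["total_gib"], row["avg_bytes"]))
```

For this dump that works out to roughly 4.9 GiB of free space spread over about 18.3 million blocks, i.e. under 300 bytes per block on average. Many small gaps like this are consistent with heap fragmentation, although the numbers alone do not prove it.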

The CLR stack from the faulting thread

0:134> !clrstack
OS Thread Id: 0x1778 (134)
        Child SP               IP Call Site
0000009ff5110b68 00007ff84b19070a [HelperMethodFrame_1OBJ: 0000009ff5110b68] System.Threading.Monitor.Enter(System.Object)
0000009ff5110c60 00007ff83638c87e System.Threading.Tasks.Task.AddException(System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2041]
0000009ff5110cd0 00007ff836d1b4c8 System.Threading.Tasks.Task.HandleException(System.Exception) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2929]
0000009ff5110d20 00007ff83638d4ab System.Threading.Tasks.Task.Execute() [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2515]
0000009ff5115cd8 00007ff838d3120d [GCFrame: 0000009ff5115cd8] 
0000009ff5115e78 00007ff838d3120d [HelperMethodFrame_2OBJ: 0000009ff5115e78] System.Environment.GetResourceFromDefault(System.String)
0000009ff5115f70 00007ff83630240e System.Environment.GetResourceString(System.String, System.Object[]) [f:\dd\ndp\clr\src\BCL\system\environment.cs @ 1332]
0000009ff5115fb0 00007ff7da46ccac common2.taskservice.SimpleLockingTaskQueue.RunReadonlyEvent(TaskQueueEntry)
0000009ff511af78 00007ff838d3120d [GCFrame: 0000009ff511af78] 
0000009ff511b118 00007ff838d3120d [HelperMethodFrame_2OBJ: 0000009ff511b118] System.Environment.GetResourceFromDefault(System.String)
0000009ff511b210 00007ff83630240e System.Environment.GetResourceString(System.String, System.Object[]) [f:\dd\ndp\clr\src\BCL\system\environment.cs @ 1332]
0000009ff511b250 00007ff836386e87 System.Exception.ToString(Boolean, Boolean) [f:\dd\ndp\clr\src\BCL\system\exception.cs @ 439]
0000009ff511b2a0 00007ff7da362b87 common2.objectevents.RequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].EventCall(System.Action)
0000009ff511d408 00007ff838d3120d [HelperMethodFrame_1OBJ: 0000009ff511d408] System.Threading.Monitor.Enter(System.Object)
0000009ff511d500 00007ff7da0aa16a System.Collections.Concurrent.ConcurrentDictionary`2[[System.Int64, mscorlib],[System.__Canon, mscorlib]].AcquireLocks(Int32, Int32, Int32 ByRef) [f:\dd\ndp\clr\src\BCL\system\Collections\Concurrent\ConcurrentDictionary.cs @ 1911]
0000009ff511d560 00007ff7da489e91 System.Collections.Concurrent.ConcurrentDictionary`2[[System.Int64, mscorlib],[System.__Canon, mscorlib]].GetKeys() [f:\dd\ndp\clr\src\BCL\system\Collections\Concurrent\ConcurrentDictionary.cs @ 1955]
0000009ff511d5c0 00007ff7da489c4a common2.database.KeyIdTable`1[[System.__Canon, mscorlib]].GetIds(System.Collections.Generic.List`1, System.__Canon)
0000009ff511d620 00007ff7da489864 common2.database.SimpleObjectQueryTable`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Get(System.__Canon, common2.database.RetrieveContext)
0000009ff511d7d0 00007ff7da4891e5 common2.database.ObjectTable.Query[[System.__Canon, mscorlib]](System.__Canon, common2.database.RetrieveContext)
0000009ff511d880 00007ff7da488ecf common2.database.GlobalCaching.ContextualPureObjectQuery`1[[System.__Canon, mscorlib]].Query[[System.__Canon, mscorlib]](System.__Canon)
0000009ff511d980 00007ff7da488236 common2.database.CacheObjectQuery`1[[System.__Canon, mscorlib]].Query[[System.__Canon, mscorlib]](System.__Canon)
0000009ff511daa0 00007ff7da5d661d HnG_States.timeevent_extensions.By_ObjType_ObjId(common2.database.Interfaces.IObjectTableQuery`1, System.String, Int64)
0000009ff511db00 00007ff7da5cb610 WarServer.StrategyCleanup_CronJob.CleanupTransports(common2.database.Interfaces.IObjectDbConnection2, HnG_States.ProtoObjects.war)
0000009ff511dca0 00007ff7da5c9aab WarServer.StrategyCleanup_CronJob.HandleCronJob()
0000009ff511dd70 00007ff7da5c28fa common2.objectevents.CronJobRequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Handle()
0000009ff511deb0 00007ff7da3623ec common2.objectevents.RequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].EventCall(System.Action)
0000009ff511e240 00007ff7da361eaf common2.objectevents.RequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].HandleEvent()
0000009ff511e2e0 00007ff7da5ae7a0 common2.objectevents.ClientRequestEventHandler`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].HandleEvent()
0000009ff511e390 00007ff7da46bebb common2.taskservice.SimpleLockingTaskQueue.RunReadonlyEvent(TaskQueueEntry)
0000009ff511e7e0 00007ff7da46bcd1 common2.taskservice.SimpleLockingTaskQueue+<>c__DisplayClass31_0.<MainQueueWorkerThread>b__1()
0000009ff511e810 00007ff83638d436 System.Threading.Tasks.Task.Execute() [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2498]
0000009ff511e850 00007ff83633ca72 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
0000009ff511e920 00007ff83633c904 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
0000009ff511e950 00007ff83638d6dc System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2827]
0000009ff511ea00 00007ff83638cdf3 System.Threading.Tasks.Task.ExecuteEntry(Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2767]
0000009ff511ea40 00007ff836374882 System.Threading.ThreadPoolWorkQueue.Dispatch() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 820]
0000009ff511eed8 00007ff838bd6793 [DebuggerU2MCatchHandlerFrame: 0000009ff511eed8] 

Condensed output from !threads

0:134> !threads
ThreadCount:      1154
UnstartedThread:  0
BackgroundThread: 243
PendingThread:    0
DeadThread:       905
Hosted Runtime:   no
                                                                                                        Lock  
       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
 134 1111 1778 0000009f5fcc1e00  10a9228 Preemptive  00000097711E0D00:00000097711E1190 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000097711df1a8 (nested exceptions)
 135 1250 1288 0000009fef1bc510  1029228 Preemptive  000000A4D5D27438:000000A4D5D28868 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a4d5d258e0 (nested exceptions)
 136 1298  c20 0000009f5fc995c0  10a9228 Preemptive  0000009A2E09FA40:0000009A2E0A02A0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009a2e098cd0 (nested exceptions)
 138 1332 34c8 0000009f5e739410  1029228 Preemptive  000000A4447D6088:000000A4447D68E8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a4447d1198 (nested exceptions)
 137 1309 1bac 0000009f5fd1ded0  1029228 Preemptive  000000905604D918:000000905604E178 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009056033c68 (nested exceptions)
 139 1064 2378 000000a5ed90a830  1029228 Preemptive  000000A3063531D0:000000A306353A30 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a30634e778 (nested exceptions)
 141 1232 33a8 0000009f5ead9840  1029228 Preemptive  000000A87D726CD0:000000A87D727DC0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a87d725178 (nested exceptions)
 142 1136 1fac 000000900689acc0  1029228 Preemptive  00000099EE7269F0:00000099EE726B30 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000099ee724e98 (nested exceptions)
 143 1377 3540 000000a045163fd0  10a9228 Preemptive  00000095667C19F8:00000095667C29A8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000095667bfea0 (nested exceptions)
 144 1241  d00 000000a010ab3870  1029228 Preemptive  0000009165F29128:0000009165F29988 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009165f229c8 (nested exceptions)
 145 1242 2704 00000090071ff070  10a9228 Preemptive  000000952E0F5840:000000952E0F60A0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000952baa2ec8 (nested exceptions)
 147  674 2f2c 0000009f5dfcb720  10a9228 Preemptive  000000A1DB23A7F8:000000A1DB23C388 0000009032df1aa0 0     MTA (Threadpool Worker) common2.objectevents.GlobalCacheRetryException 000000a1db23a6c8
 151 1226 3584 000000900447a040  1029228 Preemptive  00000099308F2770:00000099308F2FD0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000993089daa8 (nested exceptions)
 152 1090 19cc 0000009f5fb2ba50  1029228 Preemptive  000000A306345418:000000A3063469C0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a3063437e0 (nested exceptions)
 155 1209 34ec 0000009f5e96df00  1029228 Preemptive  000000A1DA486760:000000A1DA486FC0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a1da40bdd8 (nested exceptions)
 154 1163 1dac 00000090064c7010  1029228 Preemptive  000000905604B918:000000905604C178 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a6bd558138 (nested exceptions)
 156 1187 1ea8 0000009006e05630  10a9228 Preemptive  000000A718498E00:000000A7184993F0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a71848f1f0 (nested exceptions)
 157  248 2a9c 000000a82f251800  10a9228 Preemptive  000000A5F3B38500:000000A5F3B39398 0000009032df1aa0 1     MTA (Threadpool Worker) System.NotSupportedException 000000a5f3b366f0
 158 1339 21ec 000000a036221870  1029228 Preemptive  0000009C2A8F2D38:0000009C2A8F3400 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009c2a8bf188 (nested exceptions)
 159 1171  194 0000009fef2b0dd0  1029228 Preemptive  000000A5F383A1B0:000000A5F383B368 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a5f3838658 (nested exceptions)
 161 1215 2224 000000a5ed90c770  1029228 Preemptive  000000A22583CDB0:000000A22583D610 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a225035148 (nested exceptions)
 162 1073 1e68 0000009fef17d350  10a9228 Preemptive  000000A8360DD5D0:000000A8360DDB90 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a8360dba78 (nested exceptions)
 163 1381 34a4 0000009004477160  1029228 Preemptive  000000A718488710:000000A7184893F0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a718486bb8 (nested exceptions)
 164 1365 3684 0000009006b62e90  10a9228 Preemptive  0000009B2EAC16D8:0000009B2EAC3648 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009b2eab7b48 (nested exceptions)
 171 1323 13f0 0000009f413dc5f0  10a9228 Preemptive  000000A87DF9FB48:000000A87DF9FC88 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a87df9dff0 (nested exceptions)
 175 1274 1368 000000a300c6ffc0  10a9228 Preemptive  0000009C2A9D0C00:0000009C2A9D1460 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009c2a9bc720 (nested exceptions)
 166 1240 34b4 0000009fef17eac0  1029228 Preemptive  000000A2C1FEAE78:000000A2C1FEB6D8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a2c1fda230 (nested exceptions)
 174 1251 3014 000000a13449a060  1029228 Preemptive  000000952DC45020:000000952DC46B68 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000952dc434c8 (nested exceptions)
 168 1362 147c 000000a01d889030  10a9228 Preemptive  000000A6BE3C7F18:000000A6BE3C8778 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a6be3b96e0 (nested exceptions)
 167 1366 23d4 0000009004c7c830  1029228 Preemptive  00000096AC78B9E0:00000096AC78C240 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000096ac6b88d0 (nested exceptions)
 179 1324  928 0000009f41579010  1029228 Preemptive  00000099AD7D8D48:00000099AD7DAD38 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000099ad7d71f0 (nested exceptions)
 177 1221 13e4 000000a045164f70  1029228 Preemptive  000000A5A1495748:000000A5A1495FA8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a5a145ac98 (nested exceptions)
 178 1293 23a8 0000009006bf9af0  1029228 Preemptive  00000094AE08CFE0:00000094AE08EF58 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000094ae08b488 (nested exceptions)
 181 1349 29e4 000000a045162090  1029228 Preemptive  0000009165F272C0:0000009165F27988 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009165f1f6a8 (nested exceptions)
 184 1296 1f84 0000009f417a4dc0  1029228 Preemptive  0000009A2D8D3A98:0000009A2D8D42F8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009a2d84dcd8 (nested exceptions)
 193 1102 11a4 0000009006a26610  1029228 Preemptive  000000A2764CC658:000000A2764CDBD8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a2764cab00 (nested exceptions)
 186 1210 25ec 0000009f5eadbf50  1029228 Preemptive  00000096AC6B76E0:00000096AC6B79A8 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000096ac6b5b88 (nested exceptions)
 188 1379 13a4 000000a3678a7e20  1029228 Preemptive  000000A36F372090:000000A36F3728F0 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a36f3059d8 (nested exceptions)
 190 1255 17d0 000000a3678a9590  1029228 Preemptive  000000986D851380:000000986D853138 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000986d84f828 (nested exceptions)
 191 1235 193c 0000009f4769c770  1029228 Preemptive  000000A834B596F0:000000A834B59F50 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 000000a834ad32f0 (nested exceptions)
 205  783 2fcc 0000009f46811e50  10a9228 Preemptive  0000009BAF601AC8:0000009BAF603188 0000009032df1aa0 0     MTA (Threadpool Worker) common2.objectevents.GlobalCacheRetryException 0000009baf6016b0
 208 1357 3ac4 0000009f5dfcaf50  1029228 Preemptive  0000009165F25128:0000009165F25988 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 0000009165f1c3b8 (nested exceptions)
 210 1243 2884 0000009fef2af660  1029228 Preemptive  00000099EE7122D0:00000099EE712B30 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000099ee6ff608 (nested exceptions)
 211 1239 263c 000000a810bcde10  1029228 Preemptive  0000009464764D30:0000009464764F30 0000009032df1aa0 0     MTA (Threadpool Worker) System.OutOfMemoryException 00000094647631d8 (nested exceptions)
 219 1183 3820 000000a34ad75a10  8029228 Preemptive  000000986D8548D8:000000986D855138 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 000000986d83e5f0 (nested exceptions)
 220 1217 1968 0000009fef12abb0  8029228 Preemptive  00000099EE716110:00000099EE716B30 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 00000099ee713f70 (nested exceptions)
 224 1335 1e44 0000009f5dfc9010  8029228 Preemptive  000000A87D7BD2F0:000000A87D7BDE50 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 000000a87d7bb150 (nested exceptions)
 223 1327 3244 0000009f418eb490  8029228 Preemptive  000000A5A149AD40:000000A5A149BFA8 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 000000a5a1498ba0 (nested exceptions)
 225 1228  4e0 000000a01e879e10  8029228 Preemptive  000000A3FDE5D5C0:000000A3FDE5DE20 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 000000a3fde548e0 (nested exceptions)
 229 1354 3550 0000009006873d60  8029228 Preemptive  00000094EEB90F28:00000094EEB91788 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 00000094eeb8c548 (nested exceptions)
 230 1152 2b3c 0000009006875ca0  8029228 Preemptive  0000009AE81807E0:0000009AE8180D58 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 0000009ae8148908 (nested exceptions)
 246 1265 3670 0000009f5eada010  80a9228 Preemptive  000000A8BFA16770:000000A8BFA16FD0 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 000000a8bfa114b8 (nested exceptions)
 272 1029 2c3c 0000009004ef0be0  8029228 Preemptive  0000009AE8CB9950:0000009AE8CBA460 0000009032df1aa0 0     MTA (Threadpool Completion Port) System.OutOfMemoryException 0000009ae8cb97f0
 213  949  c4c 0000009fef212f00    a0228 Preemptive  0000009058538D40:000000905853AC20 0000009032df1aa0 0     MTA System.OutOfMemoryException 0000009058538c58

I've filtered out (Threadpool Worker) and (Threadpool Completion Port) threads

Edit2: Even more info. We have been running perfmon on the side, and one thing we can see is that, before the crash, the number of "Sink blocks in use" goes crazy: it goes from a stable 200-300 or so to more than 60 million. Sometimes the process recovers from this, but sometimes it doesn't, and then it crashes.

Any clues to what could cause this?

Edit3: Added an image of the sink block counter: https://i.stack.imgur.com/vECQZ.jpg

Edit4: We are running with the GC in server mode and with:

System.Runtime.GCSettings.LatencyMode = System.Runtime.GCLatencyMode.SustainedLowLatency;

Henning

  • Using .NET, I take it? If so, then the large object heap may have become fragmented. This is not defragged by default, but [you can initiate it programmatically](https://msdn.microsoft.com/en-us/library/system.runtime.gcsettings.largeobjectheapcompactionmode(v=vs.110).aspx) on .NET 4.5.1 or later. – Matthew Watson Oct 04 '17 at 15:39
  • There is more than one way to run out of memory. These particular failures are buried inside the CLR so hard to diagnose. Running out of kernel memory pool or hitting a handle quota is a distinct possibility with such a giant process. – Hans Passant Oct 04 '17 at 15:40
  • @MatthewWatson I'm somewhat reluctant to believe memory fragmentation is an issue in a 64-bit address space? – Henning Semler Oct 04 '17 at 19:37
  • @HansPassant That sounds more like what I would expect. Any ideas on how to figure out what it is? Would an sxstrace help here? – Henning Semler Oct 04 '17 at 19:38
  • 90000+ async pinned handles, yikes. Very hard to imagine code that could do this. Some background info [is here](https://stackoverflow.com/a/7555664/17034). – Hans Passant Oct 05 '17 at 12:08
  • @ThomasWeller Yes, I know we have a problem :) Any clues to what could cause this, or how I approach finding the issue? – Henning Semler Oct 06 '17 at 07:49
