
I have a C#/Managed process that failed with an out of memory exception, but inspecting the dump shows very little memory actually being used. I'm hoping for some help in figuring out next steps to investigate.

I have a full memory dump of the offending process and I'm using WinDBG with SOS to try to figure out what's going on.

  1. Memory Dump Size is 1,949,280,487 bytes
  2. !EEHeap -gc gives GC Heap Size: Size: 0xd8dcd0 (14,212,304) bytes. !VerifyHeap and !DumpHeap -stat show consistent numbers, so I think I should look at native memory rather than managed memory.
  3. !heap -s shows one heap with 1,409,884 k Commit, which looks promising (details below)
  4. !heap -stat -h 01740000 (for the 1.4 GB heap) indicates a minimal amount of memory in use (details below). The first entry has a total size of 0x25fec = 155,628 bytes and calls that 2.98% of "total busy bytes". That would seem to indicate roughly 5,222,416 total busy bytes, which isn't even comparable to the commit size of the heap.
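
The arithmetic behind item 4 can be checked with a small sketch. It only restates the figures quoted from !heap -s and !heap -stat above; the function names are illustrative, and the result is an estimate because the reported 2.98% is itself rounded.

```python
def estimate_busy_bytes(entry_total: int, percent_of_busy: float) -> float:
    """Back out total busy bytes from one !heap -stat row and its percentage."""
    return entry_total / (percent_of_busy / 100)


def kib_to_bytes(kib: int) -> int:
    """Convert the (k) columns of !heap -s to bytes."""
    return kib * 1024


# Values copied from the debugger output in this post.
busy = estimate_busy_bytes(0x25FEC, 2.98)  # first row of !heap -stat -h 01740000
commit = kib_to_bytes(1_409_884)           # Commit (k) for heap 01740000

print(f"estimated busy bytes: {busy:,.0f}")
print(f"committed bytes:      {commit:,}")
print(f"busy / commit:        {busy / commit:.2%}")
```

Under these assumptions the busy bytes come to well under 1% of the committed size, which is the mismatch the question is about.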

At this point, I hit the end of what I feel competent investigating and move into the realm of wild speculation.

  • Could this just be native memory fragmentation? It seems like ~1.4 GB of commit vs. ~5 MB in use is way too big to be explained by fragmentation.
  • !address shows huge swaths of alternation between 0x41000 PAGE_READWRITE and 0x1000 PAGE_NOACCESS. What could explain all of that?
  • Inside the PAGE_READWRITE chunks, the contents are mostly null:
    • db 7fdd8000 L41000 shows mostly null bytes with repetitions of the same string interspersed
    • For example, dU 7fdd804C gives 0=goog_update_data.3=adk%3D1111111111%26tt%3D1111111%26bs%3D1111%2C111%26mtos%3D0%2C1111111%2C1111111%2C1111111%2C1111111%26tos%3D0%2C1111111%2C0%2C0%2C0%26p%3D111%2C111%2C111%2C1111%26iehp%3D1%26mcvt%3D1111111%26rs%3D3%26ht%3D0%26tfs%3D1111%26tls%3D1111111%26mc%3D0.11%26lte%3D-1%26bas%3D0%26bac%3D0%26avms%3Dgeo%26bos%3D1111%2C1111%26ps%3D1111%2C1111%26ss%3D1920%2C1080%26pt%3D0%26d (potentially sensitive numbers replaced with 1), which URL-decodes to 0=goog_update_data.3=adk=1111111111&tt=1111111&bs=1111,111&mtos=0,1111111,1111111,1111111,1111111&tos=0,1111111,0,0,0&p=111,111,111,1111&iehp=1&mcvt=1111111&rs=3&ht=0&tfs=1111&tls=1111111&mc=0.11&lte=-1&bas=0&bac=0&avms=geo&bos=1111,1111&ps=1111,1111&ss=1920,1080&pt=0&d
    • The same string repeats itself several times within the block
    • The app in question hosts a managed WebBrowser control. Is it possible that this is a JavaScript leak in a hosted webpage and not my app itself leaking?
    • Are those PAGE_NOACCESS pages some form of protection against buffer overflows? Is this heap running some strange debugging GFlags I don't know about?
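
The percent-decoding of the string found in the dump can be reproduced with a minimal Python sketch. It uses only the standard library's `urllib.parse`; the `sample` value is a short excerpt of the already anonymised string, and the helper name `decode_fragment` is made up for illustration.

```python
from urllib.parse import parse_qsl, unquote


def decode_fragment(fragment: str) -> str:
    """Percent-decode a fragment such as the one read with dU."""
    return unquote(fragment)


# Short excerpts of the anonymised string quoted above.
sample = "adk%3D1111111111%26tt%3D1111111%26bs%3D1111%2C111"
tail = "mc%3D0.11%26lte%3D-1%26bas%3D0"

decoded = decode_fragment(sample)
print(decoded)
print(decode_fragment(tail))

# Once decoded, the payload splits into ordinary key/value pairs.
for key, value in parse_qsl(decoded):
    print(key, "=", value)
```

Seeing the payload as plain key/value pairs makes it easier to recognise as data produced by the hosted web page rather than by the application itself.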

!heap -s

LFH Key                   : 0xd4a58b52
Termination on corruption : DISABLED
  Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast 
                    (k)     (k)    (k)     (k) length      blocks cont. heap 
-----------------------------------------------------------------------------
01740000 00000002 1409924 1409884 1409868    416   320   293    2      0   LFH
014e0000 00001002     116     64     60     10     7     1    0      0   LFH
01c20000 00001002   53284  52388  53228    491   116    14    0      0   LFH
03590000 00001002      60      4     60      0     1     1    0      0      
01b80000 00041002      60      4     60      2     1     1    0      0      
05a50000 00001002    1136    124   1080     24    10     2    0      0   LFH
05b40000 00001002      60     12     60      4     2     1    0      0      
061f0000 00041002     116     64     60     16     4     1    0      0   LFH
07410000 00001002      60     12     60      4     2     1    0      0      
073f0000 00001002    3180   1156   3124     51    37     3    0      0   LFH
09b60000 00001002    3180   1716   3124     17     7     3    0      0   LFH
09af0000 00001002      60      4     60      2     1     1    0      0      
0ece0000 00001002      60      8     60      6     1     1    0      0      
13140000 00001002    1136    336   1080    301    10     2    0      0   LFH
13130000 00001002    1136    116   1080     42    13     2    0      0   LFH
133a0000 00001002     180     68    124     42     6     1    0      0   LFH
-----------------------------------------------------------------------------

!heap -stat -h 01740000

 heap @ 01740000
group-by: TOTSIZE max-display: 20
    size     #blocks     total     ( %) (percent of total busy bytes)
    25fec 1 - 25fec  (2.98)
    23d0c 1 - 23d0c  (2.81)
    2000 f - 1e000  (2.36)
    1c843 1 - 1c843  (2.24)
    10 1896 - 18960  (1.93)
    17b8c 1 - 17b8c  (1.86)
    50 434 - 15040  (1.65)
    a36c 2 - 146d8  (1.60)
    114 10a - 11ec8  (1.41)
    c 17a7 - 11bd4  (1.39)
    84 215 - 112d4  (1.35)
    4000 4 - 10000  (1.26)
    fc24 1 - fc24  (1.24)
    10c9 f - fbc7  (1.24)
    3bc 43 - fa34  (1.23)
    80 1e2 - f100  (1.18)
    800 1e - f000  (1.18)
    4fe8 3 - efb8  (1.18)
    12c9 c - e16c  (1.11)
    d0 10b - d8f0  (1.06)

!address -summary

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Heap                                  11774          5a0b8000 (   1.407 GB)  70.78%   70.35%
<unknown>                              1652          131f3000 ( 305.949 MB)  15.03%   14.94%
Image                                  1090          10649000 ( 262.285 MB)  12.88%   12.81%
Stack                                    93           1a10000 (  26.063 MB)   1.28%    1.27%
Free                                    399            c4f000 (  12.309 MB)            0.60%
Other                                    14             53000 ( 332.000 kB)   0.02%    0.02%
TEB                                      31             47000 ( 284.000 kB)   0.01%    0.01%
PEB                                       1              3000 (  12.000 kB)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE                           13378          67145000 (   1.611 GB)  81.02%   80.53%
MEM_IMAGE                              1199          12bf5000 ( 299.957 MB)  14.74%   14.65%
MEM_MAPPED                               78           5667000 (  86.402 MB)   4.24%    4.22%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_COMMIT                            13802          758dc000 (   1.837 GB)  92.40%   91.84%
MEM_RESERVE                             853           9ac5000 ( 154.770 MB)   7.60%    7.56%
MEM_FREE                                399            c4f000 (  12.309 MB)            0.60%

--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE                         7089          5dd90000 (   1.466 GB)  73.76%   73.32%
PAGE_EXECUTE_READ                       164           c603000 ( 198.012 MB)   9.73%    9.67%
PAGE_READONLY                           431           8d13000 ( 141.074 MB)   6.93%    6.89%
PAGE_NOACCESS                          5698           1642000 (  22.258 MB)   1.09%    1.09%
PAGE_WRITECOPY                          211            cee000 (  12.930 MB)   0.64%    0.63%
PAGE_EXECUTE_READWRITE                   87            291000 (   2.566 MB)   0.13%    0.13%
PAGE_EXECUTE_WRITECOPY                   38            12f000 (   1.184 MB)   0.06%    0.06%
PAGE_EXECUTE                             20             a7000 ( 668.000 kB)   0.03%    0.03%
PAGE_READWRITE|PAGE_GUARD                64             9f000 ( 636.000 kB)   0.03%    0.03%

--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Heap                                        178af000            346000 (   3.273 MB)
<unknown>                                    f613000           1ead000 (  30.676 MB)
Image                                       63341000           10ae000 (  16.680 MB)
Stack                                        de80000             fd000 (1012.000 kB)
Free                                               0             10000 (  64.000 kB)
Other                                       7f2a0000             23000 ( 140.000 kB)
TEB                                          10cc000              3000 (  12.000 kB)
PEB                                          10ba000              3000 (  12.000 kB)

!address (subset of results)

7f3d8000 7f419000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f419000 7f41a000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS                      Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f41a000 7f45b000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f45b000 7f45c000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS                      Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f45c000 7f49d000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f49d000 7f49e000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS                      Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f49e000 7f4df000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f4df000 7f4e0000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS                      Heap       [ID: 0; Handle: 01740000; Type: Segment]
7f4e0000 7f521000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Heap       [ID: 0; Handle: 01740000; Type: Segment]
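
The alternating pattern in the listing above can be tallied with a rough sketch. It parses a few rows copied from that listing (trailing columns dropped) and estimates how many 0x41000 + 0x1000 pairs would be needed to account for the heap's commit size. The function name and the assumption that the whole heap follows this pattern are illustrative only; the dump does not confirm that.

```python
from collections import defaultdict


def summarise(listing: str) -> dict:
    """Sum region sizes per protection from '!address'-style rows."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or incomplete rows
        base, end, size = (int(f, 16) for f in fields[:3])
        if end - base != size:
            raise ValueError(f"inconsistent row: {line!r}")
        totals[fields[5]] += size  # column 6 holds the protection
    return dict(totals)


rows = """\
7f3d8000 7f419000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE
7f419000 7f41a000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS
7f41a000 7f45b000    41000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE
7f45b000 7f45c000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_NOACCESS
"""

totals = summarise(rows)
pair_size = 0x41000 + 0x1000        # one read/write chunk plus one no-access page
commit_bytes = 1_409_884 * 1024     # Commit (k) for heap 01740000 from !heap -s
estimated_pairs = commit_bytes // pair_size

print({k: hex(v) for k, v in totals.items()})
print("estimated pairs to fill the commit size:", estimated_pairs)
```

If the assumption holds, the estimate is of the same order as the PAGE_NOACCESS region count in the !address -summary output, which would suggest several thousand separate allocations of about 260 KB each rather than fragmentation among small blocks.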
Whatabohr
  • Could be due to fragmentation? – PepitoSh Jun 11 '18 at 04:27
  • Is it plausible 5MB of busy bytes would correspond to 1.4GB of commit even if it was incredibly fragmented? – Whatabohr Jun 11 '18 at 04:38
  • Also, if you are suspecting Webbrowser, try CefSharp: https://cefsharp.github.io/ – zaitsman Jun 11 '18 at 05:06
  • Just for curiosity what are you trying to do when you get the error? – TheGeneral Jun 11 '18 at 05:11
  • @TheGeneral The specific error occurred in an attempt to Base64 encode a 40KB byte array. – Whatabohr Jun 11 '18 at 05:21
  • @Whatabohr gut feeling based on sum of your address summary is that about 2 GB is reserved for CLR running your app, and more is not addressable. Try switching to 64bit process, if you do have more RAM. Otherwise, more details are needed as to how you use the web browser control, what is going on etc. – zaitsman Jun 11 '18 at 05:37
  • @zaitsman: how do you come to this conclusion? Do you have any official sources for this? Otherwise I'd call your statement wrong and misleading. The 2 GB limit is the non-LAA limit and has nothing to do with .NET. The .NET heap manager works directly above `VirtualAlloc()` and in this case is responsible for a maximum of 305 MB as shown in the `unknown` line of memory. Switching to 64 bit is indeed an option, but has nothing to do with RAM. The address space of a process is virtual memory, not RAM. It will be swapped to disk, if needed. – Thomas Weller Jun 11 '18 at 06:23
  • @Whatabohr: If you suspect some GFlags settings already, it would be useful to know if "Page Heap" was turned on. The full page heap can cause up to 2000 times use of memory, so OOM situations are indeed much more likely. The `!gflag` command should show the settings. – Thomas Weller Jun 11 '18 at 06:33
  • @ThomasWeller `!gflag` gives: `Current NtGlobalFlag contents: 0x00000000`. Thanks for pointing that command out to me. It helps to eliminate at least one of my crazy theories – Whatabohr Jun 11 '18 at 06:37
  • @ThomasWeller something like this: https://stackoverflow.com/questions/14186256/net-out-of-memory-exception-used-1-3gb-but-have-16gb-installed Try `LARGEADDRESSAWARE` I'd ask - does the error go away if you compile in 64 bit? – zaitsman Jun 11 '18 at 06:57
  • if you can repro it, use ETW (Perfview, Windows Performance Toolkit) or any other .Net Profiler to trace the memory usage grow. – magicandre1981 Jun 11 '18 at 14:43
  • @magicandre1981: why use a .NET profiler if most of the memory is in `Heap` and not in `<unknown>`? Shouldn't UMDH be better suited? – Thomas Weller Jun 12 '18 at 06:31
  • Agree with UMDH. Also would be curious around the managed stack for the outofmemory exception. – kvr Jun 28 '18 at 05:10

1 Answer

Thanks everyone for the help and pointers on this. We were eventually able to isolate the offending component: a hosted WinForms WebBrowser control was loading an external page that itself leaked memory in the browser. We mitigated the issue by changing configuration so that our dashboard no longer loads that obsolete and buggy page (we were lucky that it wasn't useful anymore).

In retrospect, the indications were all there that the leak was neither a managed (.NET) leak nor a purely native one, since none of the tools we tried narrowed it down.

Whatabohr