12

Question:

Is there an easy way to get a list of the types of resources that leak in a running application? In other words, by attaching to a running application?

I know MemProof can do it, but it slows things down so much that the application won't even last a minute. Most Task Manager-like tools can show the number of handles, but not their type.

It is not a problem if the check itself is catastrophic (halts the app process), since I can check with a task manager whether I'm getting close (or at least I hope so).

Any other insights on hunting resource leaks (so not memory leaks) are also welcome.

Background:

I have a Delphi 7/2006/2009 app (it compiles with all three), and after a few weeks it starts acting funny. However, this happens at only one of the places it runs; on several other systems it runs till the power goes out.

I've tried to put in some debug code to narrow the problem down, and found out that the exception is EOutOfResources on a save of a file (the file save can happen thousands of times a day).

I have tried to rule out memory leaks (with FastMM), but since the data flow is quite high (60 MByte/s from gigabit industrial cameras), I can only rule out "creeping" memory leaks with FastMM, not quick bursts of leaks that exhaust memory around the time the error happens. If something goes wrong, the app fills memory in under half a minute.

Main suspects are file handles that are somehow left open on some error, and TMetafiles (which are streamed to these files). Minor suspects are VST, popup menus and TFrames.

Updates:

Another possible tip: it ran fine for two years with D7, and now the problems are with Turbo Explorer (which I use for stable projects not converted to D2009).

Paul-Jan: Since it only happens once a week (and that can happen at night), information acquisition is slow. Which is why I ask this question: I need to combine things for when I'm there on Thursday. In short: no, I don't know with 100% certainty. I intend to bring the entire Systemtools collection to see if I can find something (because by then it will have been running for days). There is also a chance that I'll see open files (maybe I should try to find some MinGW lsof and schedule it).

But the app sees very little GUI action (it is a machine vision inspection app), apart from a screen refresh at roughly 15/s, which is a TBitmap StretchDraw plus a TMetafile. I get this error when saving to disk (TFileStream), though, so handles are probably really exhausted. However, a TMetafile is also written to that same stream with SaveToStream, something which later apps no longer do, and they can run for months.

------------------- UPDATE

I've searched and searched and searched, and managed to reproduce the problems in vitro two or three times. The problems happened when memory usage was roughly 256MB (the systems have 2GB), with 200 user objects, 500 GDI objects, and not one file more open than expected.

This is not really exceptional. I do notice that I leak small numbers of handles, probably due to reparenting frames (something in the VCL seems to leak HPALETTEs), but I suspect the core cause is a different problem. I reuse TMetafile objects and .Clear them in between. I think clearing a metafile doesn't really (always?) shrink the underlying resource, so eventually each metafile in the pool ends up at its maximum size, and with 20-40+ TMetafiles (which can be several hundred KB each) this will hit the desktop heap limit.

That's the theory. I'll try to verify it by setting the desktop heap limit to 10MB at the customer's site, but it will be several weeks before I have confirmation whether this changes anything. The theory would also explain why this machine is special (it's possible that this machine naturally has slightly larger metafiles on average). Occasionally freeing and recreating a TMetafile in the pool might also help.

Luckily all these problems (both tmetafile and reparenting) have already been designed out in newer generations of the apps.

Due to the special circumstances (and the fact that I have very limited test windows), this is going to be a while, but I decided to accept the desktop heap as an example for now (though the GDILeaks stuff was also somewhat useful).

Another thing the audit revealed was the use of GDI types in a thread (though only for saving TMetafiles, which weren't used or connected otherwise, to streams).

------------- Update 2.

Increasing the desktop heap limit only seemed to slightly increase the time until the problem occurred.

Unfortunately, I won't be able to follow up on this further, since the machines were updated to a newer version of the framework that doesn't have the problem.

In summary I can only state what the three core modifications were going from the old to the new framework:

  • I no longer change screens by reparenting frames. I now work with forms that I hide and show. I changed this because I also had very rare crashes or exceptions (that could be clicked away) due to this. Those crashes all happened while operating the GUI though, not spontaneously like the main problem.
  • The routine where the crash happened dealt with TMetafile. TMetafile has been designed out and replaced by a simpler home-made format (basically arrays of OpenGL vertices).
  • Drawing is no longer done with a TBitmap with a TMetafile overlay StretchDrawn over it, but with OpenGL.

Of course it could be something else too, that got changed in the rewrite of the above parts, fixing some very nasty detail bug. It would have to be an extremely bad one, since I analysed the above system as much as I could.

Updated Nov 2012 after some private mail discussion: In retrospect, the next step would have been to add a counter to the metafile objects, simply reinstantiate them every x * 1000 uses or so, and see if that changes anything. If you have similar problems, see whether you can destroy and reinitialize long-lived, dynamically allocated resources at somewhat regular intervals.
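The counter idea could be sketched roughly as follows. This is only an illustration of the suggestion above and was never tried on the system in question; `TRecycledMetafile`, `Acquire` and `AMaxUses` are made-up names, and the only VCL calls assumed are `TMetafile.Create`, `TMetafile.Clear` and `FreeAndNil`.

```pascal
unit RecycledMetafile;

interface

uses
  SysUtils, Graphics;

type
  // Hypothetical wrapper: hands out a cleared TMetafile and replaces the
  // underlying object with a fresh instance after AMaxUses acquisitions.
  TRecycledMetafile = class
  private
    FMetafile: TMetafile;
    FUseCount: Integer;
    FMaxUses: Integer;
  public
    constructor Create(AMaxUses: Integer);
    destructor Destroy; override;
    // Returns an empty metafile. The caller must not keep the reference
    // across calls, because the instance may be replaced.
    function Acquire: TMetafile;
  end;

implementation

constructor TRecycledMetafile.Create(AMaxUses: Integer);
begin
  inherited Create;
  FMaxUses := AMaxUses;
  FMetafile := TMetafile.Create;
end;

destructor TRecycledMetafile.Destroy;
begin
  FMetafile.Free;
  inherited Destroy;
end;

function TRecycledMetafile.Acquire: TMetafile;
begin
  Inc(FUseCount);
  if FUseCount > FMaxUses then
  begin
    // Throw the long-lived object away instead of trusting Clear to
    // release everything it has accumulated.
    FreeAndNil(FMetafile);
    FMetafile := TMetafile.Create;
    FUseCount := 1;
  end
  else
    FMetafile.Clear;
  Result := FMetafile;
end;

end.
```

Comparing the time to failure with, say, `AMaxUses = 1000` against a run without recycling would show whether the long-lived metafiles are involved at all.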

Marco van de Voort
  • While taskmanager can't show you the type of handles that are leaking, it _can_ help confirm whether there are handles leaking at all. It's not clear to me whether this is the case, perhaps you can update your answer with this info? – Paul-Jan Feb 02 '10 at 06:10
  • I am having almost exactly the same problem right now, but there are no metafiles involved. – Warren P Jul 23 '10 at 16:18
  • How do you change desktop limits? – Warren P Jul 23 '10 at 16:20
  • The same error hunts me for a long time. – Gabriel Oct 30 '11 at 13:15
  • Altar: In retrospect I suspect TMetafile, or TBitmap. To show them they were painted on tpanels, and then optionally they were saved. I suspect something going wrong there. I moved to opengl, and the only problem I have there is painful font support. – Marco van de Voort Oct 30 '11 at 16:24
  • I just added an answer describing how I solved this by not painting directly to the main form canvas, which also held other controls. The problem hasn't arisen since I made all the drawing occur on its own canvas, rather than the main form's canvas. – Jerry Dodge Sep 23 '19 at 01:37

8 Answers

13

There is a slim chance that the error is misleading. The VCL naively reports EOutOfResources if it is unable to obtain a DC for a window (see TWinControl.GetDeviceContext in Controls.pas).

I say "naively" because there are other reasons why GetDC() might return a NULL handle, and the VCL should report the OS error rather than assume an out-of-resources condition (a Windows version check is required for this to be reliably possible, but the VCL could and should take care of that too).

I had a situation where I was getting the EOutOfResources error as the result of a window handle becoming invalid. Once I'd discovered the true problem, finding the cause and fixing it was simple, but I wasted many, many hours trying to find a non-existent resource leak.

If possible I would examine the stack trace leading to this exception - if it is coming from TWinControl.GetDeviceContext then the problem may not be what you think (it's impossible to say what it might be of course, but eliminating the impossible is always the first step toward discovering the solution, no matter how improbable).
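As a rough sketch of that kind of check (not the VCL's own code), the helper below calls GetDC itself and, on failure, reports the OS error and whether the window handle is still valid. `DescribeGetDCFailure` is a hypothetical name, and whether GetLastError returns something useful after a failed GetDC may depend on the Windows version, as noted above.

```pascal
unit GetDCDiag;

interface

uses
  Windows, SysUtils;

// Hypothetical diagnostic helper: tries GetDC on Wnd and describes the
// outcome as text suitable for a log file.
function DescribeGetDCFailure(Wnd: HWND): string;

implementation

function DescribeGetDCFailure(Wnd: HWND): string;
var
  DC: HDC;
  Err: DWORD;
begin
  DC := GetDC(Wnd);
  if DC <> 0 then
  begin
    ReleaseDC(Wnd, DC);
    Result := 'GetDC succeeded';
    Exit;
  end;
  // Capture the error code before any other API call can overwrite it.
  Err := GetLastError;
  if (Wnd <> 0) and not IsWindow(Wnd) then
    Result := Format('GetDC failed: window handle %d is no longer valid',
      [Integer(Wnd)])
  else
    Result := Format('GetDC failed, OS error %d: %s',
      [Integer(Err), SysErrorMessage(Err)]);
end;

end.
```

Calling this from the exception handler that catches EOutOfResources, with the handle of the control involved, would distinguish a destroyed window from a genuine resource shortage.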

Deltics
  • Great tip. What may be the reason for GetDC to return a NULL? – Gabriel Oct 30 '11 at 13:18
  • @Altar - the simplest one, and the one that was catching me out in my case, is when the HWND passed to GetDC() is invalid. In my case it was a window that had already been destroyed. Once I had discovered this [with my own call to GetLastError()] it was actually quite simple to see how and why the HWND was being [unexpectedly] destroyed before I had obtained the DC and to fix that problem. – Deltics Oct 30 '11 at 20:20
6

If they are GDI handle leaks you can have a look at MSDN Magazine January 2003 which uses the tool GDILeaks. Other tools are GDIObj or GDIView. Also see here.

Another source of EOutOfResources could be that the Desktop Heap is full. I've had that issue on busy terminal servers with large screens.
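For reference, the desktop heap sizes are configured in the `SharedSection=` part of the `Windows` value under `HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems`. The sketch below only reads the current setting so it can be logged; `ReadSharedSection` is a made-up helper name, and changing the value requires editing the registry with administrative rights and rebooting.

```pascal
unit DesktopHeapInfo;

interface

uses
  Windows, SysUtils, Registry;

// Returns the text after 'SharedSection=' (for example '1024,3072,512'),
// or an empty string if it cannot be read.
function ReadSharedSection: string;

implementation

function ReadSharedSection: string;
const
  KeyPath = 'SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems';
  Marker = 'SharedSection=';
var
  Reg: TRegistry;
  Value: string;
  P, Q: Integer;
begin
  Result := '';
  Reg := TRegistry.Create;
  try
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    if Reg.OpenKeyReadOnly(KeyPath) and Reg.ValueExists('Windows') then
    begin
      Value := Reg.ReadString('Windows');
      P := Pos(Marker, Value);
      if P > 0 then
      begin
        // Drop everything up to and including the marker.
        Delete(Value, 1, P + Length(Marker) - 1);
        // The setting ends at the next space, if there is one.
        Q := Pos(' ', Value);
        if Q > 0 then
          SetLength(Value, Q - 1);
        Result := Value;
      end;
    end;
  finally
    Reg.Free;
  end;
end;

end.
```

The numbers are sizes in KB; the second one is the heap size for each interactive desktop and the third the size for non-interactive desktops (services).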

If you are leaking lots of file handles, you could check out Process Explorer, look at the open file handles of your process, and see whether any are out of the ordinary. Or use WinDbg with the !htrace command.

Lars Truijens
3

I've run into this problem before. From what I've been able to tell, Delphi may throw an EOutOfResources any time the Windows API returns ERROR_NOT_ENOUGH_MEMORY, and (as the other answers here discuss) Windows may return ERROR_NOT_ENOUGH_MEMORY for a variety of conditions.

In my case, EOutOfResources was being caused by a TBitmap - in particular, TBitmap's call to CreateCompatibleBitmap, which it uses with its default PixelFormat of pfDevice. Apparently Windows may enforce fairly strict systemwide limits on the memory available for device-dependent bitmaps (see, e.g, this discussion), even if your system otherwise has plenty of memory and plenty of GDI resources. (These systemwide limits are apparently because Windows may allocate device-dependent bitmaps in the video card's memory.)

The solution is simply to use device-independent bitmaps (DIBs) instead (although these may not perform quite as well). To do this in Delphi, set TBitmap.PixelFormat to anything other than pfDevice. This KB article describes how to pick the optimal DIB format for a device, although I generally just use pf32bit instead of trying to determine the optimal format for each of the monitors the application is displayed on.
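A minimal sketch of that change, assuming a small factory function (`CreateDIBBitmap` is a made-up name) is used wherever the application creates its bitmaps:

```pascal
unit DIBBitmaps;

interface

uses
  Graphics;

// Creates a bitmap that is a 32-bit DIB from the start.
function CreateDIBBitmap(AWidth, AHeight: Integer): TBitmap;

implementation

function CreateDIBBitmap(AWidth, AHeight: Integer): TBitmap;
begin
  Result := TBitmap.Create;
  try
    // Set the pixel format before the size, so that no device-dependent
    // bitmap of the full size is allocated first.
    Result.PixelFormat := pf32bit;
    Result.Width := AWidth;
    Result.Height := AHeight;
  except
    Result.Free;
    raise;
  end;
end;

end.
```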

Josh Kelley
  • I checked, and the source sets all images to pf8bit after creation. The rewrite killed tbitmap in favour of an own image type. (based on array of byte. So my guess is something similar, but then for tmetafile. – Marco van de Voort Aug 10 '13 at 16:58
  • Do you remember the name behind "this mailing list discussion"? the link is dead now. – Wolf Jan 27 '15 at 14:03
  • @Wolf - I don't remember, but I did find another discussion (or another copy of the same discussion) and updated the link. Thanks. – Josh Kelley Jan 27 '15 at 14:23
2

Most of the times I saw EOutOfResources, it was some sort of handle leak.

Did you try something like MadExcept?

--jeroen

Jeroen Wiert Pluimers
  • I know, which is why I'm looking for a way to find a resourceleak type summary. I used to do this with memproof (that afaik hooked a lot of win32 calls to do this), but that slows down way too much. – Marco van de Voort Feb 02 '10 at 08:00
  • I suggest you focus on stack traceback dumps and figure it out from context, rather than resource summaries. However, I believe AQTime might be able to link individual resource counts with lines of code that created them, which might be useful to you. – Warren P Jul 23 '10 at 16:15
  • MadExcept 4 has the ability to capture resource leaks (32 bit applications only) – Nicholas Ring Nov 02 '12 at 22:53
2

"I've tried to put in some debug code to narrow the problem down. and found out that the exception is EOutofResources on a save of a file. (the file save can happen thousands of times a day)."

I'm shooting in the dark here, but could it be that you're using the Windows API to (GetTempFileName) create a temp file and you're blowing out some file system indexes or forgetting to close a file handle?

Either way, I do agree with your supposition about it being a file handle problem. That seems to be the most likely thing given your symptoms and diagnosis.

Allen Bauer
  • No GetTempFileName. I thought about file handles immediately, and checked. However, all major saves are in try..finally blocks in very simple, easy-to-follow thread .Execute methods (all commonly done saves are concentrated in a storage thread since they block too long). None of the threads seem to stop (which is what an unhandled exception would cause), but with logging added, it starts to return this (handled) EOutOfResources. – Marco van de Voort Feb 02 '10 at 08:48
0

Also try to check handle count for the application with Process Explorer from SysInternals. Handle leaks can be very dangerous and they build slowly through time.
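The same counts can also be logged from inside the application, so that there is a history to look at after a failure. The sketch below is one possible way to do it: `ResourceCountLine` is a made-up name, and the two API functions are declared locally on the assumption that the Windows.pas in use may not declare them. Note that linking GetProcessHandleCount statically like this requires Windows XP SP1 or later.

```pascal
unit ResourceCounts;

interface

uses
  Windows, SysUtils;

// Returns one log line with the kernel handle, GDI object and USER object
// counts of the current process.
function ResourceCountLine: string;

implementation

const
  // uiFlags values for GetGuiResources.
  GR_GDI = 0;
  GR_USER = 1;

function GetGuiResources(hProcess: THandle; uiFlags: DWORD): DWORD; stdcall;
  external 'user32.dll' name 'GetGuiResources';
function GetProcessHandleCount(hProcess: THandle;
  var pdwHandleCount: DWORD): BOOL; stdcall;
  external 'kernel32.dll' name 'GetProcessHandleCount';

function ResourceCountLine: string;
var
  Handles: DWORD;
begin
  // Report 0 if the handle count cannot be queried.
  if not GetProcessHandleCount(GetCurrentProcess, Handles) then
    Handles := 0;
  Result := Format('%s handles=%d gdi=%d user=%d',
    [DateTimeToStr(Now),
     Integer(Handles),
     Integer(GetGuiResources(GetCurrentProcess, GR_GDI)),
     Integer(GetGuiResources(GetCurrentProcess, GR_USER))]);
end;

end.
```

Writing this line to the application log every few minutes shows which of the three counters, if any, is creeping upwards. It does not identify the type of the leaked objects.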

Runner
0

I am currently having this problem, in software that is clearly not leaking any handles in my own code, so if there are leaks they could be happening in a component's source code or the VCL sourcecode itself.

The handle count and the GDI and user object counts are not increasing, nor is anything being created. Deltics' answer shows corner cases where the message is something of a red herring, and Allen suggests that even a file write can cause this error.

So far, the best strategy I have found for hunting them down is to use either JCL's JclDebug stack tracebacks or the exception report saving features in MadExcept to generate the context information needed to find out what is actually failing.

Secondly, AQTime contains many tools to help you, including a resource profiler that can keep the links between where the code that created the resources is, and how it was called, along with counts of the total numbers of handles. It can grab results MID RUN and so it is not limited to detecting unfreed resources after you exit. So, run AQTime, do a results capture in mid run, wait several hours, and capture again, and you should have two points in time to compare handle counts. Just in case it is the obvious thing. But as Deltics wisely points out, this exception class is raised in cases where it probably shouldn't have been.

Warren P
0

I spent all of today chasing this issue down. I found plenty of helpful resources pointing me in the direction of GDI, with the fact that I'm using GDI+ to produce high-speed animations directly onto the main form via timer/invalidate/onpaint (animation performed in separate thread). I also have a panel in this form with some dynamically created controls for the user to make changes to the animation.

It was extremely random and spontaneous. It wouldn't break anywhere in my code, and when the error dialog appeared, the animation on the main form would continue to work. At one point, two of these errors popped up at the same time (as opposed to sequential).

I carefully observed my code and made sure I wasn't leaking any handles related to GDI. In fact, my entire application tends to keep less than 300 handles, according to Task Manager. Regardless, this error would randomly pop up. And it would always correspond with the simplest UI related action, such as just moving the mouse over a standard VCL control.

Solution

I believe I have solved it by changing the logic to perform the drawing within a custom control, rather than directly on the main form as I had been doing before. I think that because I was rapidly drawing on the same form canvas that was shared with other controls, they somehow interfered. Now that the animation has its own dedicated canvas to draw on, it seems to be perfectly fixed.
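The general shape of such a control might look like the sketch below. It is not my actual code: `TAnimationSurface` and `ShowFrame` are made-up names, and it draws a plain TBitmap with the VCL canvas instead of using GDI+.

```pascal
unit AnimationSurface;

interface

uses
  Classes, Controls, Graphics;

type
  // Hypothetical control that owns the canvas the animation is drawn on,
  // so that nothing is painted on the form's own canvas.
  TAnimationSurface = class(TCustomControl)
  private
    FFrame: TBitmap; // most recent frame; not owned by the control
  protected
    procedure Paint; override;
  public
    // Call from the main thread whenever a new frame is ready.
    procedure ShowFrame(AFrame: TBitmap);
  end;

implementation

procedure TAnimationSurface.ShowFrame(AFrame: TBitmap);
begin
  FFrame := AFrame;
  Invalidate; // request a repaint; Paint runs in the main thread
end;

procedure TAnimationSurface.Paint;
begin
  if FFrame <> nil then
    Canvas.StretchDraw(ClientRect, FFrame)
  else
  begin
    // No frame yet: just clear the background.
    Canvas.Brush.Color := Color;
    Canvas.FillRect(ClientRect);
  end;
end;

end.
```

The control is placed on the form next to the panel with the other controls, and the timer calls `ShowFrame` instead of invalidating the form.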

That is with about 1 hour of vigorous testing at least.

[Image: animation working without errors]

[Fingers crossed]

Jerry Dodge