24

I'm experiencing a problem in a WPF app where the render thread occasionally stops rendering, preventing drawing updates, while the UI thread and helper threads are still pumping messages.

It appears to be related to corruption of the presentation font cache; however, this seems unlikely, as the app recovers fine on reboot.

We have seen a similar issue that occurred when applying a scale transform to a TextBlock and was solved by deleting the font cache; however, this particular problem is not reliably repeatable.

What is the best way to diagnose the root cause of this problem?

I have opened a bug with Microsoft on Connect, but it will not be considered unless others vote it up.

LukeN

5 Answers

5

The freeze was caused by a hosted ActiveX control rendering video.

There was a race condition in the way the control used DirectShow, which caused DirectX to hang.

We found this problem by taking a process dump with ProcDump and then opening the dump file in the Windows debugger (WinDbg).
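As a rough illustration of the dump step, here is a minimal Python sketch that assembles a ProcDump command line for a full-memory dump. It assumes `procdump` is on the PATH of a Windows machine; the process ID and output path are placeholders, not values from the original investigation.

```python
import subprocess


def build_procdump_command(pid, dump_path):
    """Return the argument list for a full-memory dump of one process."""
    return [
        "procdump",
        "-accepteula",  # accept the Sysinternals licence without a prompt
        "-ma",          # write a full dump that includes all process memory
        str(pid),
        dump_path,
    ]


def capture_dump(pid, dump_path):
    """Run ProcDump and return its exit code without interpreting it."""
    return subprocess.run(build_procdump_command(pid, dump_path)).returncode


if __name__ == "__main__":
    # 4242 and the path below are placeholder values for illustration only.
    print(" ".join(build_procdump_command(4242, r"C:\dumps\app_hang.dmp")))
```

Once the dump is open in WinDbg, `~*k` prints the native call stack of every thread, which is a reasonable starting point for spotting the thread that is stuck waiting.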

Hunting around on the net and inspecting the native call stacks showed a problem where the high-order byte of a critical section pointer was zeroed, which meant that one of the threads was waiting on a non-existent critical section that could never be signaled.
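To make the corruption concrete, the following small sketch shows what zeroing the high-order byte does to a pointer-sized value. The address is hypothetical and chosen only for illustration; the actual pointer values from the dump are not given here.

```python
def zero_high_byte(pointer, bits=32):
    """Clear the most significant byte of a pointer-sized value."""
    mask = (1 << (bits - 8)) - 1  # keeps every bit except the top eight
    return pointer & mask


# Hypothetical address of a valid critical section in a 32-bit process.
original = 0x7A3F1C20
corrupted = zero_high_byte(original)
print(hex(original), "->", hex(corrupted))
```

The corrupted value still looks like a plausible address, but nothing valid lives there, so a thread that waits on it is never released.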

This allowed us to create a repeatable hang by exercising the code that started and stopped the video. We removed the controls, and the hang stopped.

LukeN
  • Hi, I'm facing the same issue, could you please elaborate on which control exactly gave you this error and how you used procdump to monitor it? (which command did you use?) Thanks! – Uri Abramson Jul 28 '14 at 14:32
  • There is a video at [Channel 9](http://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-9-ProcDump) which will show you how to use it. Making sense of the native stack frames is going to be more difficult if you haven't written much C/C++. Helpers on how to use WinDbg: [cheat sheet](http://theartofdev.com/windbg-cheat-sheet/) and [loading .NET](http://mylittlereminder.wordpress.com/2011/07/08/windbg-load-sos-in-windbg-0x80004005/) – LukeN Jul 29 '14 at 01:37
1

I don't know why it happens, but I have experienced it before. It is easier to observe in apps that target .NET Framework 4.0 and run on older machines (XP, Vista).

What I did to solve it:

  1. Delete FontCache3.0.0.0.dat
  2. Permanently disable the font cache service on the offending machine

Solution 1 worked on one XP machine. It also worked on a Vista machine, but after a while the problem showed up again.

To delete FontCache3.0.0.0.dat, you first need to stop the "Windows Presentation Foundation Font Cache 3.0.0.0" service. In Vista the file is located under c:\windows\serviceprofiles\localservice\appdata\local. In XP it is under c:\windows\system32\documents and settings\localservice\local settings\application data (I might have misspelled a folder name).
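The stop-then-delete sequence could be scripted roughly as follows. This is a sketch under assumptions: it uses the Vista path quoted above, assumes `net stop` accepts the service's display name, needs an elevated prompt to do anything real, and defaults to a dry run that only prints the steps.

```python
import ntpath
import os
import subprocess

SERVICE = "Windows Presentation Foundation Font Cache 3.0.0.0"
CACHE_FILE = "FontCache3.0.0.0.dat"
# Vista location quoted in the answer; the XP location differs.
VISTA_CACHE_DIR = r"c:\windows\serviceprofiles\localservice\appdata\local"


def plan_cache_reset(cache_dir=VISTA_CACHE_DIR):
    """Return the ordered steps: stop the service, then delete the cache file."""
    return [
        ("run", ["net", "stop", SERVICE]),
        ("delete", ntpath.join(cache_dir, CACHE_FILE)),
    ]


def reset_font_cache(cache_dir=VISTA_CACHE_DIR, dry_run=True):
    """Print each step and execute it only when dry_run is False."""
    for action, target in plan_cache_reset(cache_dir):
        print(action, target)
        if dry_run:
            continue
        if action == "run":
            # A non-zero exit code is ignored so that an already-stopped
            # service does not abort the cleanup.
            subprocess.run(target)
        elif os.path.exists(target):
            os.remove(target)


if __name__ == "__main__":
    reset_font_cache()  # dry run: prints the plan without touching anything
```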

I have also found that disabling the service altogether (solution 2) did not affect the performance of my .NET apps.

Padu Merloti
  • Hi Padu, I mentioned that the issue appears similar, but recovers on startup which seems to rule out a problem with the font cache. – LukeN Jun 02 '11 at 17:36
1

The only way to find the root cause of the issue is going to be constant logging from the thread until you can find a reason why it hangs. I can suggest lots of ways to do that logging, but it depends on how complex that code in the render thread is. Without having lots of debugging information (the sort of thing that introduces enough latency to temporarily resolve the problem, no less) you're not going to be able to drill down to the one time where it does occur.

If you can reproduce it in Visual Studio, then you should use some console logging around the anticipated trouble spots; otherwise you're likely going to have to write it to a text file or send it to the system logger.
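A minimal sketch of that kind of file logging, written in Python purely for illustration: each suspect step is wrapped so that an "enter" line with no matching "leave" line at the end of the log points at the step that stalled. The logger name, file name, and step name are hypothetical, and this only covers application code, not threads owned by the framework.

```python
import logging
from contextlib import contextmanager


def make_trace_logger(path, name="hang-trace"):
    """Create a logger that appends timestamped, per-thread lines to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate lines if called twice
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(threadName)s] %(message)s")
        )
        logger.addHandler(handler)
    return logger


@contextmanager
def traced(logger, step):
    """Log entry to and exit from a suspect step."""
    logger.debug("enter %s", step)
    try:
        yield
    finally:
        # Never reached if the step hangs, which is the signal to look for.
        logger.debug("leave %s", step)


if __name__ == "__main__":
    log = make_trace_logger("hang_trace.log")
    with traced(log, "start_video"):  # hypothetical step name
        pass
```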

Can you get it to reoccur in a simple app that only does the rendering and related parts of the rest of the app, or does it only occur (can only occur?) in the full program?

jcolebrand
  • The framework is in charge of the render thread, which is the problem. See http://blogs.msdn.com/b/nickkramer/archive/2005/07/19/437025.aspx for details. Essentially, the application is hanging in Microsoft's code, while my code is still running as it should. What I'm interested in knowing is whether problems with the font cache service are actually causing the hang, or whether there is another similar problem in the internals of the WPF libraries that could cause the issue. If the latter is true, how would I go about testing it (given that I don't understand the internals of WPF or DirectX)? – LukeN Jun 06 '11 at 00:32
  • Can you not reflect it to see what it's doing? Surely you know about Reflector or ILSpy, yes? – jcolebrand Jun 06 '11 at 01:22
0

You could use a watchdog service that clears the cache when the service detects the issue. The service would have to poll on a periodic basis whenever your application is running.

I will be the first to admit that this is a sub-optimal solution and, unless you are able to flip the service on and off for only small slices of time, you are likely to drain batteries rather quickly.
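The polling idea can be sketched as a heartbeat watchdog. This assumes the application can report some sign of rendering progress by calling `beat()`; how to obtain such a signal from WPF is not covered here, and all names and the recovery callback are hypothetical.

```python
import time


class HeartbeatWatchdog:
    """Tracks the time of the last reported sign of progress."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_beat = clock()

    def beat(self):
        """Called by the monitored code whenever it makes progress."""
        self._last_beat = self._clock()

    def is_stalled(self):
        """True when no progress has been reported within the timeout."""
        return self._clock() - self._last_beat > self.timeout_s


def poll(watchdog, on_stall, interval_s=1.0, max_polls=None, sleep=time.sleep):
    """Check the watchdog periodically and run the recovery action on a stall."""
    polls = 0
    while max_polls is None or polls < max_polls:
        if watchdog.is_stalled():
            on_stall()       # for example, clear the cache as suggested above
            watchdog.beat()  # restart the timeout so recovery is not repeated every poll
        sleep(interval_s)
        polls += 1
```

A longer polling interval reduces the power cost mentioned above, at the price of slower detection.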

Zian Choy
  • My question is not how to workaround the issue. I'm looking for help diagnosing the root cause. – LukeN Jun 01 '11 at 05:45
0

I think you need a polling loop that continuously checks whether the software is still running and resets it whenever it hangs.

jcolebrand
Eljay