I have a very strange situation. We have a relatively large app (~500K lines of code), developed by upwards of 10 different developers over the last 6 years. It has been working fine until our most recent release. With the latest build, we have multiple customers complaining that it is sporadically hanging and we are having a heck of a time figuring out how/why! Here are some of the things making it challenging to debug:

  • Until this morning, we have been totally unable to reproduce this problem in house.
  • We have never seen this happen when the debugger is attached! This obviously makes it challenging to solve.
  • This does not tend to happen when people are using the app, but rather after they've stopped using it for some time.
  • Could this (maybe) be related to the screen saver coming on?
  • It seems to happen somewhat consistently when changing screen resolutions

This morning we finally figured out how to reproduce at least one scenario in house by:

  • Running the app outside of the debugger.
  • Changing the screen resolution. This hangs the app.
  • Then we can attach with the debugger.

The problem though (now that we can, at least in one case, reproduce it) is that our code is not running when it hangs! In other words, there is only one thread running at the time it hangs, and the line it breaks on is the Application.Run(form); from program.cs.

One final point is that the application is not completely hung. Specifically:

  • It still does screen painting (it refreshes parts of the screen covered by other apps, for example).
  • I can't click on the UI elements that are shown, but it also doesn't "beep" at me like I would expect if it were completely unresponsive.
  • When I "pause" the application after attaching the debugger, I can minimize/maximize it while paused. Otherwise, it won't respond to minimize/maximize commands.
  • Other than not beeping at me, it behaves as though there is a modal window off-screen that I just can't see.

Additionally (as mentioned before), when I pause the app, it pauses on the Application.Run line, and there are no other threads/code running in the threads list (as I would expect to see if there were a modal dialog box blocking the main window).

The behavior is most strange IMHO - especially since it has only recently started happening. My next steps are going to have to be to start "subtracting" sections of the code until I find the culprit, but I figured I'd throw the problem up here first and see if anyone else has ever experienced anything like this before.

Thanks in advance for any guidance - I look forward to hearing any suggestions.

ej

Edit: Another way of stating the behavior. After it has "hung" it behaves as though it has no problem, except that it receives no messages from my mouse and keyboard. In other words, it still repaints itself, and can be paused by the VS2010 debugger, but does not respond in any way shape or form to mouse/keyboard events. Here again though, it doesn't start beeping at me as it does with other apps that are truly dead. Like, it does not show as being unresponsive in the task manager. It's just in a sort of weird "I'm not listening for I/O anymore, sorry!" state... Strange!

Edit:

In my last edit, I mentioned that it was not accepting IO anymore. This got me thinking, so I added a TcpListener to see if that would still respond after it "hung" - and it does. Additionally, in thinking about the fact that it still updates the screen, I put a breakpoint in the paint event and got some (more) odd behavior. It hits the breakpoint in the paint event, but the paint event is NOT at the top of the call stack at that point. The top of the call stack shows "In sleep, wait or join". Then the line in paint is next, then external code, then main. So the line highlighted on the screen is Green - not Yellow. Additionally, if I press F10 (to step over), it does step, moving down exactly 1 line, but still, the paint method is not at the top of the call stack. At this point, there are no other threads running, no other code executing, nothing else happening?!?! What is going on here?

One final point - I set a breakpoint in the TcpListener's Accept Socket event, and when I connect to the TCP/IP port, it breaks on that code, and that code IS at the top of the call stack.

Sorry - But I'm still quite confused.

eejai42
  • is CPU load high when it's hanging? – Matthew Jul 20 '12 at 19:25
  • And the .NET Framework version is? Are you using WPF or WinForms? – huysentruitw Jul 20 '12 at 19:32
  • No CPU activity. v3.5. I added an additional description of the problem in a recent edit and your question about CPU goes right inline with that additional description. It's not at all (as far as I can tell at least) stuck in an infinite loop or anything. It's not DOING anything - that's the whole problem. It's just not listening to IO requests (keyboard/mouse) anymore... – eejai42 Jul 20 '12 at 19:35
  • Did you by chance double-post the same issue? http://stackoverflow.com/questions/11583979/doevents-hanging – Wiktor Zychla Jul 20 '12 at 19:38
  • `It has been working fine until our most recent release.` What went into your most recent release - take a "quick" scan through your source code history. Might be a red herring, might be a red flag. – Wonko the Sane Jul 20 '12 at 19:47
  • Hi Wonko, yes, I would say that this is related to that other post, but the first post was asking about Application.DoEvents, and the more I explore the problem, the more it seems to be unrelated to Application.DoEvents(). I'm in a tricky spot, because while this is not SPECIFICALLY about application.DoEvents, the two issues do seem to be related. The same underlying "bug" has prompted both questions - but they are (IMHO at least) two separate questions. – eejai42 Jul 20 '12 at 20:15
  • This is a WinForms app. As far as what has changed, there are 2 months of changes, from 3 developers. If/when I get to that point, it will not unfortunately be in any way shape or form quick to review what has changed. I'm going to add to the original question with some new information that my most recent debugging has uncovered. – eejai42 Jul 20 '12 at 20:15
  • IMO, it is (as you point out) behaving exactly as if a message box or dialog is up. Start your search for any new instances or paths to these. – Wonko the Sane Jul 20 '12 at 20:20
  • That's a great idea Wonko. To test it, I had my app pop up a modal dialog box. Then, I added a breakpoint in the paint event of the main form. Even with the modal dialog active, when the debugger hits the paint breakpoint, that paint event is at the top of the call stack. As described in the last edit of the main question - once the app has "hung", it still hits the breakpoints in the paint event, but the paint event (strangely) is NOT at the top of the call stack. "In sleep, wait or join" is. The paint event line is Green, not Yellow. Strange - right? – eejai42 Jul 20 '12 at 20:26
  • @WonkotheSane - what's the protocol here? The behavior I'm seeing with the app still responding, but showing "In sleep, wait or join" indicates there is SOME sort of threading problem, but no threads are active. The other question was specifically about Application.DoEvents. They are two separate questions, but at the same time, they seem to be related. Should I delete that other question or something? In the other question, I put a link to this question indicating that they might be related. Not sure of protocol as I'm somewhat new to SO. Thanks again for your help. – eejai42 Jul 20 '12 at 20:29
  • Is your code properly covered with try/catch? Check this. – lsalamon Jul 20 '12 at 20:30
  • It's interesting that you haven't mentioned DoEvents in this post at all... If DoEvents never returns, that would explain why Application.Run "hangs"... – Peter Ritchie Jul 20 '12 at 20:30
  • @PeterRitchie - That's correct. Originally, I thought this was specifically related to DoEvents (and there does seem to be some connection) - but the more I explore the problem, the more it seems to not be directly related. Specifically about 4/5ths of the time, it does NOT hang on Application.DoEvents. In fact, it doesn't hang on ANY of my code, instead, when I attach the debugger, and hit Pause, it is on the main "Application.Run(form)" line. When I look at the call stack and Thread list, our code is nowhere to be found. See my previous update about breakpoints in the paint event. :( – eejai42 Jul 20 '12 at 20:35
  • Use the debugger to walk up the stack to see which event is looping on DoEvents. Btw: this whole question is a good argument for never using `DoEvents()`. – H H Jul 20 '12 at 20:43
  • @HenkHolterman - The whole reason I posted this question separately from my original question (http://goo.gl/RS6vn) about DoEvents is that this is not related to DoEvents. Most of the time when the app gets into this state, it is NOT hung on a DoEvents line as I originally thought it was. It is hung on Application.Run(form) - and there IS NO STACK to walk. That's the whole problem. When this occurs, there is no code that we control anywhere in the threads or stack. It's just [Main] ... external code ... Application.Run(form) - and that's it. No other threads, no other stack elements. – eejai42 Jul 20 '12 at 20:52
  • I'm still investigating, but... this behavior appears to be the result of using the SQL 2008 vs. the 2005 report viewer. A few months back, we switched from report viewer v9 to v10. By repeatedly testing this behavior on previous builds of the app, we pinned it down to having started about 6 weeks ago when we switched viewers. We didn't notice it in house because it doesn't happen when debugging (and we don't spend a lot of time switching screen res.) I will explore further and post back when I have a complete picture of exactly what the behavior is... Thanks to everyone for their input! – eejai42 Jul 20 '12 at 22:52

1 Answer

You may have a problem with a control (or its handle) being created on a non-UI thread; check this question.
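As an illustration of the pattern this answer points at, here is a minimal C# sketch, assuming a WinForms app on .NET 3.5. The names `UiThreadGuard`, `CheckReportServer`, and the status label are hypothetical and do not come from the asker's code. The idea is that a worker thread does only non-UI work and hands its result back with `BeginInvoke`, so every control is constructed on the UI thread. A commonly reported explanation for hangs of this kind is that `Microsoft.Win32.SystemEvents` notifications (a display settings change, for example) get marshaled to the thread that created a control; if that thread is not pumping messages, the UI thread can end up blocked in a wait. Treat that as a likely cause to check, not a confirmed diagnosis.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper (not from the original app): remembers which thread is
// the UI thread and fails fast when UI objects are built anywhere else.
public static class UiThreadGuard
{
    private static int uiThreadId;

    // Call once at the top of Main, before any control is created.
    public static void Capture()
    {
        uiThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    // Throws if the caller is not on the thread recorded by Capture().
    public static void AssertOnUiThread(string what)
    {
        if (Thread.CurrentThread.ManagedThreadId != uiThreadId)
        {
            throw new InvalidOperationException(
                "'" + what + "' is being created off the UI thread.");
        }
    }
}

public static class Program
{
    [STAThread]
    public static void Main()
    {
        UiThreadGuard.Capture();

        Form form = new Form();
        form.Text = "Main";

        // Start the worker from Load so the form's window handle exists
        // by the time BeginInvoke is called.
        form.Load += delegate
        {
            Thread worker = new Thread(delegate()
            {
                // Non-UI work only: no Control is constructed on this thread.
                bool ok = CheckReportServer();

                // Skip the update if the form was closed in the meantime.
                if (form.IsDisposed || !form.IsHandleCreated) return;

                // Marshal back to the UI thread before touching any control.
                form.BeginInvoke((MethodInvoker)delegate
                {
                    UiThreadGuard.AssertOnUiThread("status label");
                    Label status = new Label();
                    status.AutoSize = true;
                    status.Text = ok ? "Report server reachable"
                                     : "Report server unavailable";
                    form.Controls.Add(status);
                });
            });
            worker.IsBackground = true;
            worker.Start();
        };

        Application.Run(form);
    }

    // Placeholder for the real connectivity test (assumption for this sketch).
    private static bool CheckReportServer()
    {
        Thread.Sleep(500); // simulate network latency
        return true;
    }
}
```

Calling `UiThreadGuard.AssertOnUiThread` from the constructors of your own forms and user controls (in debug builds, say) turns a silent, intermittent hang into an immediate exception at the offending call site. The check between the worker finishing and `BeginInvoke` is not airtight if the form closes at exactly that moment, so production code may also want to catch `InvalidOperationException` around the call.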

Arthur
  • THIS IS EXACTLY WHAT IT WAS!!! Thank you SOOOO MUCH!!! One of the articles referenced in the other post mentions creating UI elements on another thread. Below is the sequence of events that led to this VERY bizarre behavior, in case anyone else ever runs into this type of thing in the future! – eejai42 Jul 25 '12 at 19:53
  • 1) To ensure that the SSRS service is running, when our app starts it spawns a worker thread to connect to the reporting service. 2) It used to use the WebForms version of the report viewer to just connect to a report. 3) On June 12th, one of our devs inadvertently changed one line: using Microsoft.Reporting.WebForms to Microsoft.Reporting.WinForms. This caused the thread to create a WINDOWS ReportViewer on a non-UI thread to make this initial connection. 95% of the time, there was no problem. ALL of the behavior described here stems from this one-word change! THANK YOU ALL AGAIN! – eejai42 Jul 25 '12 at 20:00
  • Try the function in [this](https://stackoverflow.com/a/52721562/901333) answer to find out which particular controls were created on the wrong thread. – Vlad Rudenko Oct 13 '18 at 19:57