
We have an ASP.NET application running on IIS 8.5. When traffic is high, intermittently and for about a minute at a time, the request queue starts to grow and processor time drops; see the performance monitor graph:

The red line is processor time; the green line is the request queue.

The strange thing is that no application restart, 503 errors, or IIS recycles occur, so what could it be? What suddenly makes IIS hold requests in the queue for a while? Besides what the graph shows, the machine's memory looks fine and stable.

Here are some of the environment settings. Application pool queue length: 10000 (there are no 503s, so I don't think it's that).

ASP.NET config:

 <system.web>
     <applicationPool maxConcurrentRequestsPerCPU="999999" />
 </system.web>

machine.config:

<processModel autoConfig="false" requestQueueLimit="250000"/>

We configured things this way because our application uses a lot of SignalR.

The application uses Azure SQL and Azure Redis, but that is not the problem, since another virtual machine (running the same app) does not show the problem at the same moment.

Another clue: on the same VM we have the same app in another application pool, under another domain, and it behaves the same way.

Any help would be appreciated; this is driving me crazy.

Thanks!

  • Have you already looked for the problem inside of your application? Is there some way it could deadlock? Does it make connections to a database server? Is the database still available? – Peter Nov 05 '16 at 11:38
  • Hi @Peter, the application is huge, we need some tips on where to start looking. It's not something to do with the database, since another virtual machine doesn't show this problem at the same moment. – Alexandre Nov 05 '16 at 11:41
  • Try to find out what the application is doing when it hangs. The threads tab of Process Explorer might be an easy start: http://superuser.com/questions/462969/how-can-i-view-the-active-threads-of-a-running-program – Peter Nov 05 '16 at 11:44
  • I added more info above. Do you think threads are reaching the limit and hanging the application? – Alexandre Nov 05 '16 at 11:47
  • I don't know what the reason might be, but it is a step to get to the root. – Peter Nov 05 '16 at 12:04
  • OK, but can threads reach the limit and start to hang requests? Is that possible? – Alexandre Nov 05 '16 at 12:08
  • It's also worth checking the Windows EventViewer for any events that happen around that time. – Maloric Nov 07 '16 at 11:43
  • Hi @Maloric, there is nothing in the Event Viewer, unfortunately. – Alexandre Nov 07 '16 at 12:01
  • Have you tried switching off scale-out? Redis scale-out is quite unstable for SignalR. It's sad but true. – Igor Lizunov Nov 07 '16 at 16:05
  • Yes @IgorLizunov the problems started before we moved signalr to Redis. – Alexandre Nov 07 '16 at 16:12
  • 1
    @Alexandre just checking, and no other scaleout options were chosen? Because your situation (if it is not app deadlock) is 99.9% scaleout problem. We faced the same issue recently. – Igor Lizunov Nov 07 '16 at 16:18
  • It's a virtual machine @IgorLizunov, there is no scaleout option. – Alexandre Nov 07 '16 at 17:38
  • Have you tried logging the execution time for each request? If the delay appears in these logs it could be a code issue, else you could focus on the configuration of the application. – Tasos K. Nov 09 '16 at 11:15
  • Interesting, what's the best way to log requests? – Alexandre Nov 09 '16 at 11:58
  • You can use Application Insights, but it will take time to add all the telemetry calls throughout your code: https://azure.microsoft.com/en-us/documentation/articles/app-insights-asp-net-dependencies/ – zivkan Nov 11 '16 at 03:45
  • We are having same exact issue. Did you figure this out? – Mike Flynn May 18 '21 at 02:50
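For reference, a minimal sketch of the per-request timing suggested in the comments above, using Global.asax events instead of Application Insights. The Trace.WriteLine call is a placeholder for whatever logger the application already uses:

    using System;
    using System.Diagnostics;
    using System.Web;

    public class Global : HttpApplication
    {
        private const string StopwatchKey = "__RequestStopwatch";

        protected void Application_BeginRequest(object sender, EventArgs e)
        {
            // Start a stopwatch for this request and stash it in the per-request Items bag.
            HttpContext.Current.Items[StopwatchKey] = Stopwatch.StartNew();
        }

        protected void Application_EndRequest(object sender, EventArgs e)
        {
            // Log how long the request took. Slow entries here point to code;
            // missing entries during a hang point to requests never leaving the queue.
            if (HttpContext.Current.Items[StopwatchKey] is Stopwatch sw)
            {
                sw.Stop();
                Trace.WriteLine($"{HttpContext.Current.Request.Url} took {sw.ElapsedMilliseconds} ms");
            }
        }
    }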

4 Answers


Did you follow the recommendations that Redis provides about configuring the ThreadPool growth settings? See here. I had similar issues.
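For context, that StackExchange.Redis guidance boils down to raising the ThreadPool minimums so that bursts of work don't wait on the default thread-injection rate. A minimal sketch, assuming an ASP.NET app with a Global.asax; the value 200 below is a placeholder, not a recommendation:

    using System;
    using System.Threading;
    using System.Web;

    public class Global : HttpApplication
    {
        protected void Application_Start()
        {
            // Only raise the minimums, never lower them. Beyond the minimum,
            // the ThreadPool injects new threads slowly (roughly one per 500 ms),
            // which is what stalls bursty SignalR/Redis workloads.
            ThreadPool.GetMinThreads(out int workerThreads, out int iocpThreads);
            ThreadPool.SetMinThreads(Math.Max(workerThreads, 200), Math.Max(iocpThreads, 200));
        }
    }

Since the question's machine.config already sets processModel autoConfig="false", the same can also be done declaratively via the minWorkerThreads and minIoThreads attributes on that element.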

Jeroen

In the environment you have described, I would look at the points below:

  1. Have a look at the DTU percentage for Azure SQL, if your SignalR operations have anything to do with the database. Just try going one level up on the DTU scale to handle bursts.
  2. For an app using SignalR across multiple virtual machines (assuming they are load balanced), do you have a central SignalR backplane, or does each server act as its own SignalR server? (See the sketch after this list.)
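To illustrate the "central" option in point 2: with load-balanced web servers, SignalR needs a shared backplane so that messages reach clients connected to any machine. A rough sketch using the Microsoft.AspNet.SignalR.Redis package; the host, password, and app name are placeholders:

    using Microsoft.AspNet.SignalR;
    using Owin;

    public class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            // Every web server points at the same Redis instance, which relays
            // SignalR messages between the servers.
            GlobalHost.DependencyResolver.UseRedis("your-redis-host", 6379, "your-password", "YourAppName");
            app.MapSignalR();
        }
    }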

If you can install New Relic or a similar tool, it will point to the source of the problem.

Hope it helps.

Walnut

You could take a memory dump of the process when the problem is happening (one way is using Task Manager), although if there are multiple IIS app pools running on the machine, finding the correct process to take a memory dump of might be non-trivial.

If you have Visual Studio Enterprise, it gives you a nice UI to analyse the dump. It will show you all the threads running in the process and what each call stack was at the time of the memory dump. You'll probably find that most of the .NET threads have a similar call stack, which points to the most likely cause of the bottleneck.

If you don't have Visual Studio Enterprise, I think you can still open the dump with WinDbg, but it's a CLI tool and I don't know the commands, so you'll need to look them up.
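Not mentioned in the answer, but a third option is the Microsoft.Diagnostics.Runtime (ClrMD) NuGet package, which lets you dump the managed call stacks from a small console app instead of Visual Studio Enterprise or WinDbg. A rough sketch against the ClrMD 1.x API; the dump path is a placeholder:

    using System;
    using Microsoft.Diagnostics.Runtime;

    class DumpThreads
    {
        static void Main()
        {
            using (DataTarget target = DataTarget.LoadCrashDump(@"C:\dumps\w3wp.dmp"))
            {
                // Attach to the CLR found in the dump and walk every managed thread.
                ClrRuntime runtime = target.ClrVersions[0].CreateRuntime();
                foreach (ClrThread thread in runtime.Threads)
                {
                    Console.WriteLine($"Thread {thread.OSThreadId:X}:");
                    foreach (ClrStackFrame frame in thread.StackTrace)
                        Console.WriteLine($"    {frame}");
                }
            }
        }
    }

If most threads share the same frames (for example, all waiting on the same lock or the same outbound call), that is the bottleneck the answer describes.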

zivkan

Add another counter to your graph: Private Bytes. This number tells you the bytes in all managed heaps plus the unmanaged allocations of the process. If it keeps increasing, you have a memory leak somewhere. Ensure all disposable objects are disposed of properly and that objects are not needlessly surviving into Gen 2 collections. Anyhow, I would start there, and if this turns out to be the issue, investigate what is leaking.
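If you prefer to sample the counter from code rather than adding it to the perfmon graph, a quick sketch; the instance name "w3wp" is a placeholder, and with several app pools you need to pick the right w3wp instance:

    using System;
    using System.Diagnostics;
    using System.Threading;

    class PrivateBytesSampler
    {
        static void Main()
        {
            // Same counter the answer refers to: Process \ Private Bytes.
            using (var counter = new PerformanceCounter("Process", "Private Bytes", "w3wp"))
            {
                for (int i = 0; i < 12; i++)
                {
                    Console.WriteLine($"{DateTime.Now:T}  Private Bytes: {counter.NextValue() / (1024 * 1024):F0} MB");
                    Thread.Sleep(TimeSpan.FromSeconds(5));
                }
            }
        }
    }

A value that climbs steadily across samples and never drops back after a Gen 2 collection is the leak signature the answer describes.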

CodingYoshi