Lock convoys in ASP.NET website

Question

TL;DR: Made refactoring for performance, website got slower. Ran the Concurrency Visualizer, the graph looks like the lock convoys as described on MSDN.

Context

I’m helping with refactoring an ASP.NET website to switch user controls from performing business logic on datasets to perform presentation logic on business objects and also reduce database calls made from the user controls.

The issue

We have noticed a significant performance drop (hangs/blockings) after introducing changes involving what we thought would be performance improvements in multiple areas.

We’re using Lean Sentry to monitor our websites’ performance. According to the hang diagnostics, the thread pool was running out of threads and (according to the descriptions on the diagnostics page) when GC runs, it stops more threads from being created. The GC Heap and Gen 0 were consuming a lot of memory (~ 9 GB), according to the memory diagnostics.

What I did so far?

I used memory profiler in Visual Studio and identified issues with our excessive DataAdapter and DataTable usage. Memory consumption dropped to 3 GB but that only helped with GC blocking. It is still slower than it had been before we introduced the changes and I still see blocking on high load caused by functions like CompilationLock.GetLock() and BuildManager.GetBuildResultFromCacheInternal(). Googling them didn’t return anything useful.
This is a website that uses JIT compilation. I had assumed that the issue with CompilationLock might be because of JIT compiling and wanted to run the website precompiled, but one of our global Utilities classes caused ambiguity with some other Utilities class/namespace that I don’t know of. I’ve found out that there is a Microsoft.Build.Utilities namespace, but it’s not referenced in our website and I can’t reproduce the ambiguity in my own environment when I reference Microsoft.Build myself, so I couldn’t get the website running on precompiled mode on the staging server to test this theory.
I made additional changes on memory allocation and the amount of database calls, using Visual Studio’s memory allocation and instrumentation profilers as a measure, but I didn’t notice any progress on performance.
I used a concurrency profiler to gather more information on thread utilization. I haven’t used this tool before, so I’m not sure about my interpretations here. There are multiple threads in each handle and in one handle I’m seeing 42% contention. I see that the DataAdapter.Fill and SqlHelper.ExecuteReader methods show up most when it’s set to “Show Just My Code” and WaitForSingleObjectExImplementation shows up most when it’s set to “Show All Code”.
I encountered a SO question about ASP.NET websites’ performance issues and set EnableSessionState="ReadOnly" for each page, but I didn’t notice difference with this change, either.
Concurrency Visualizer and Common Patterns for Poorly-Behaved Multithreaded Applications helped me identify the issue. My graph doesn’t seem like serial execution, but I see 80–90% synchronization as shown in Lock Convoys graph. I checked out a SO question on lock convoys debugging, too.

Testing approach

I’m using Screaming Frog to crawl the website in order to reproduce the issues and taking numbers of requests per second and response times in both Screaming Frog and Lean Sentry as a performance measure. It might not be the best way but the difference is noticeable, reproducible and it’s pretty much all I have at this point.

Architecture of the website

The website was originally coded in VB.NET for .NET Framework 1.0 about 10 years ago, and upgraded to .NET Framework 4.6.1 by fixing some compatibility issues. There haven’t been any architectural changes so far. There is a shared SqlHelper class, which is a collection of shared data access functions like ExecuteDataset or ExecuteDatareader, that return either a DataSet, DataReader or String value. These functions read the connection string information from the web.config file and create a new SqlConnection, SqlDataAdapter, SqlDataReader and SqlCommand object to perform the database operations. The Data Access Layer that consumes this shared class consists of classes for each module like shopping cart, category, product, etc. to be instantiated in each user control and they consist of functions that represent stored procedures in the database.

The refactoring

We have introduced some new objects to be instantiated either inside page load of the related user control, or inside OnItemDataBound event of repeaters and attached to its child user controls’ public properties, which are refactored to use the object. However, there are still other child user controls that need multiple data tables, so we decided to store one of the data tables in one of the objects and pass it to related user controls by assigning it to their public properties.

I guess that we hurt performance by introducing these objects. Even though database calls and memory consumption seem to be reduced, I’m wondering if the objects are causing threads to be synced all the time.

The graph before any refactoring happened:

The graph after all the refactoring I mentioned applied:

Will you help me identify the issue?

score 0 · Answer 1 · edited Nov 18 '16 at 19:00

Your problem is rather complex. I think that you have two basic options to resolve your refactoring performance issues:

Revert changes to the code to a point where all or much of the refactoring hadn’t yet been done and when you had better performance than what you are currently experiencing. Then, proceed gradually with the addition of new classes for performance improvements. If a change does not improve performance, then undo it and try something else.
Replace some / much of the newly added classes with versions that support the interfaces but lack the performance overhead. Do this selectively to isolate where the performance issues exist. Perhaps, the website has tapped into an unknown performance bug that was not triggered by prior implementations of the added classes.

I would favor option 1, though it may seem counterproductive. It is a bit like punting in U.S. football. Sure, it is nice to just drive down the field. But sometimes the dominant strategy is to punt, get the ball back and try to score on another drive.

Hi @JohnH. Thank you for your response. I've tried the 1st approach with 3 different versions of the code and it seemed like more objects we introduced more drop in requests per second. Then I've tried storing those objects in cache in order to avoid database calls and it showed no improvement at all. Then I figured my testing approach by looking at requests per second was not helpful since it's directly effected by the size of the page which I confirmed by crawling only the pages with output cache. We deployed the code and it seems to be in a better condition then we expected. — Engin, Nov 09 '16 at 15:53
I don't know how I should test at this point except counting on metrics gathers from profilers. — Engin, Nov 09 '16 at 15:55