Apologies for the fairly generic nature of the question - I'm simply hoping someone can contribute some suggestions and/or ideas as I'm out of both!
The background: We run a fairly large (35M hits/month, peak around 170 connections/sec) site which offers free software downloads (stricly legal) and which is written in ASP .NET 2 (VB .Net :( ). We have 2 web servers, sat behind a dedicated hardware load balancer and both servers are fairly chunky machines, Windows Server 2012 Pro 64 bit and IIS 8. We serve extensionless URLs by using a custom 404 page which parses out the requested URL and Server.Transfers appropriately. Because of this particular component, we have to run in classic pipeline mode. DB wise we use MySQL, and have two replicated DBs, reads are mainly done from the slave. DB access is via a DevArt library and is extensively cached.
The Problem: We recently (past few months) moved from older servers, running Windows 2003 Server and IIS6. In the process, we also upgraded the Devart Component and MySql (5.1). Since then, we have suffered intermitted scalability issues, which have become significantly worse as we have added more content. We recently increased the number of programs from 2000 to 4000, and this caused response times to increase from <300ms to over 3000ms (measured with NewRelic). This to my mind points to either a bottleneck in the DB (relatively unlikely, given the extensive caching and from DB monitoring) or a badly written query or code problem. We also regularly see spikes which seem to coincide with cache refreshes which could support the badly written query argument - unfortunately all caching is done for x minutes from retrieval so it can't always be pinpointed accurately. All our caching uses locks (like this What is the best way to lock cache in asp.net?), so it could be that one specific operation is taking a while and backing up requests behind it.
The problem is... I can't find it!! Can anyone suggest from experience some tools or methods? I've tried to load test, I've profiled the code, I've read through it line by line... NewRelic Pro was doing a good job for us, but the trial expired and for political reasons we haven't purchased a full licence yet. Maybe WinDbg is the way forward?
Looking forward to any insight anyone can add :)