Very very strange issue here... Apologies in advance for the wall of text.
We have a suite of applications running on an EC2 instance, all connecting to an RDS instance.
We are hosting the staging and production applications on the same EC2 server.
With one of the applications, as soon as the staging app is moved to prod, 250 or so connections to the DB are opened, causing the RDS instance to max out its CPU and slowing down the entire suite. The staging application itself does not have this issue.
The issue can be replicated both by deploying the app via our Octopus setup and by manually copying the BIN/Views folders from staging to live.
The connections open instantly, pushing CPU usage to 99% in less than a minute.
Things to note...
Running the check from "How to see active SQL Server connections?" shows the bulk connections, none of which have a LoginName.
Resource Monitor on the FE server lists the connections, all coming from IIS, seemingly scanning all outbound ports and attempting to connect to the DB server on its port. The FE server and DB server addresses are blacked out, and this is only a snippet of all of the connections.
The app needs users to log in to perform 99.9% of tasks. There is a public "Forgot your password" method that was updated to accept either a username or password. There was no change to the form structure or form action URL, just an extra check on the back end.
Other changes were around how data is displayed and payment restrictions under certain conditions, both of which require a login.
Things I've tried...
- New app pools
- Just giving it a few days to forget this ever happened
- Not using Octopus to publish
- Checking all areas that were updated between versions to see if a connection was not closed properly.
Really at a loss as to what is happening. This is the first time that I've seen something like this. It is especially strange that staging is fine, but the same app on another URL/connection string fails so badly.
The only thing I can think of is some kind of scraper polling the public form, but that makes no sense: why isn't it happening with the current app?
Is there something in AWS that can monitor the calls that are being made? I vaguely remember something in NewRelic being able to do so.
Any suggestions and/or similar experiences are welcomed.
Edits.
- Nothing outstanding in logs for the day of the issue (yesterday)
- No incoming traffic to match all of the outbound requests
- No initialisation is performed by the application on startup
Update...
We use ADO for most of our queries. A query was updated to get data from different tables. The method name and parameters were not changed, just the body of the query. If I use sys.dm_exec_sql_text to see what is getting sent to the DB, I can see that it IS the updated query that is being sent in each of the hundreds of connections. They are all showing as suspended though... Nothing has changed with regard to how that query is sent to the server, just the body of the query itself...
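For anyone wanting to reproduce that check, here is a minimal sketch of the kind of query that shows each in-flight request alongside its text, status and wait information. It assumes the login has VIEW SERVER STATE permission; the column selection is illustrative and not necessarily the exact query run above.

```sql
-- Sketch: list in-flight requests with their status, wait and blocking info.
-- Assumes the login has VIEW SERVER STATE permission.
SELECT
    r.session_id,
    s.login_name,
    s.host_name,
    s.program_name,
    r.status,                -- e.g. running, runnable, suspended
    r.wait_type,             -- what a suspended request is waiting on
    r.wait_time,             -- current wait duration in milliseconds
    r.blocking_session_id,   -- non-zero if another session is blocking this one
    r.cpu_time,              -- CPU consumed so far, in milliseconds
    t.text AS sql_text
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID     -- exclude this monitoring query itself
ORDER BY r.cpu_time DESC;
```

If most of the suspended rows share one wait_type or point at the same blocking_session_id, that may help narrow down whether the new query body is waiting on locks, I/O or something else.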