
Consider the following list of connections:

+----------+---------+------+------------------------+
| ID       | COMMAND | TIME | STATE                  |
+----------+---------+------+------------------------+
| 87997796 | Sleep   |   15 | cleaned up             |
| 90850182 | Sleep   |  105 | cleaned up             |
| 88009697 | Sleep   |   38 | delayed commit ok done |
| 88000267 | Sleep   |    6 | delayed commit ok done |
| 88009819 | Sleep   |   38 | delayed commit ok done |
| 90634882 | Sleep   |   21 | cleaned up             |
| 90634878 | Sleep   |   21 | cleaned up             |
| 90634884 | Sleep   |   21 | cleaned up             |
| 90634875 | Sleep   |   21 | cleaned up             |
+----------+---------+------+------------------------+

A short time later (under a minute):

+----------+---------+------+------------------------+
| ID       | COMMAND | TIME | STATE                  |
+----------+---------+------+------------------------+
| 87997796 | Sleep   |    9 | cleaned up             |
| 88009697 | Sleep   |   32 | delayed commit ok done |
| 88000267 | Sleep   |    9 | delayed commit ok done |
| 88009819 | Sleep   |   31 | delayed commit ok done |
| 90634882 | Sleep   |   14 | cleaned up             |
| 90634878 | Sleep   |   14 | cleaned up             |
| 90634884 | Sleep   |   14 | cleaned up             |
| 90634875 | Sleep   |   14 | cleaned up             |
+----------+---------+------+------------------------+
8 rows in set (0.02 sec)

After I finished writing this Stack Overflow post:

+----------+---------+------+------------------------+
| ID       | COMMAND | TIME | STATE                  |
+----------+---------+------+------------------------+
| 87997796 | Sleep   |    0 | cleaned up             |
| 88009697 | Sleep   |   53 | delayed commit ok done |
| 88000267 | Sleep   |    0 | delayed commit ok done |
| 88009819 | Sleep   |   52 | delayed commit ok done |
| 90634882 | Sleep   |    5 | cleaned up             |
| 90634878 | Sleep   |    5 | cleaned up             |
| 90634884 | Sleep   |    5 | cleaned up             |
| 90634875 | Sleep   |    5 | cleaned up             |
+----------+---------+------+------------------------+

Context:

These connections are opened by a third-party vendor app (the source code isn't available to us, so we don't know the details). We know that its connection management is awful, and the vendor knows it as well. It is awful because connections leak, which you can see in the first table: connection 90850182. While the others have their timers reset, this one keeps aging indefinitely. In older versions of the app it would stay forever. In newer versions it is eventually captured by a "patch" the vendor introduced, which cleans up connections after the number of seconds you specify. So it's "a leak healing patch".

The problem:

We are hosting hundreds of such vendor apps, and most of them have far more than 8 connections since they have more traffic. That results in a disgusting number of connections (thousands) that we have to maintain. About 80% of them sit in the "cleaned up" state for under 120 seconds (eventually cleaned by the aforementioned configurable app parameter).

This is all hosted on Aurora RDS, and AWS engineers told us that if the app doesn't close its connections properly, the standard "wait_timeout" isn't going to work. So "wait_timeout" becomes a useless decoration in AWS Aurora, but let's take that up with Jeff in another thread/topic.

So regardless, we have this magic configurable parameter from the third-party vendor, set on this obscure app, which controls eviction of stale connections, and it works.
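For illustration, the eviction logic such a "leak healing patch" implements can be sketched as a reaper that scans processlist rows and emits KILL statements for sessions stuck in "cleaned up" beyond a threshold. This is a hypothetical reconstruction (the vendor's real implementation isn't available to us), with made-up function names; the row shape mirrors the SHOW PROCESSLIST columns above:

```python
# Hypothetical sketch of a "leak healing" reaper. Rows mirror the
# SHOW PROCESSLIST columns used above: (ID, COMMAND, TIME, STATE).

def stale_connection_ids(processlist, threshold_sec=120):
    """Return IDs of sessions sleeping in 'cleaned up' at or past the threshold."""
    return [
        row_id
        for row_id, command, time_sec, state in processlist
        if command == "Sleep" and state == "cleaned up" and time_sec >= threshold_sec
    ]

def kill_statements(processlist, threshold_sec=120):
    """Render a KILL statement per stale session ID."""
    return [f"KILL {cid};" for cid in stale_connection_ids(processlist, threshold_sec)]

rows = [
    (87997796, "Sleep", 15, "cleaned up"),
    (90850182, "Sleep", 105, "cleaned up"),            # the leaked connection
    (88009697, "Sleep", 38, "delayed commit ok done"),
]
print(kill_statements(rows, threshold_sec=100))  # → ['KILL 90850182;']
```

With the threshold lowered toward 0, this same logic would kill every "cleaned up" session almost immediately, which is exactly the safety question below.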

The questions:

Is it safe to evict connections which are in "cleaned up" state immediately?

At the moment this happens after 120 seconds, which results in a huge number of such connections. Yet in the tables above you can see that the timers get reset, meaning something is happening to these connections and they are not entirely stale. I.e., does the app's connection pool "touch" them for further re-use?

I don't possess knowledge of connection pools' inner guts or how they are seen from within the database. Are all reserved connections of a connection pool, by default, "sleeping" in the "cleaned up" state?

So, say you start cleaning too aggressively: will you end up fighting the connection pool as it creates more connections to replenish itself?

Or do reserved connections have some different state?

Even if you don't fully understand the context, I'd expect a veteran DBA or a connection pool library maintainer to be able to help with such questions. Otherwise I'll get my hands dirty and answer this myself eventually: I'd try the Apache connection pool and HikariCP, observe them, kill their idle connections (simulating the magic parameter), and try this third-party app with the magic parameter set to 0 seconds to see if it still works.
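The experiment described above can be modeled before running it for real. Below is a toy model (not HikariCP's or Apache DBCP's actual internals) of a fixed-size pool whose idle connections get killed server-side: on checkout the pool discards dead connections and opens replacements, so aggressive eviction tends to cause churn rather than a smaller pool:

```python
# Toy model of a fixed-size connection pool facing server-side kills.
# Assumption (to be verified against real pools): the pool validates a
# connection on borrow and silently replaces dead ones.

class ToyPool:
    def __init__(self, size):
        self.created = 0                      # total connections ever opened
        self.idle = [self._new() for _ in range(size)]

    def _new(self):
        self.created += 1
        return {"id": self.created, "alive": True}

    def server_kill_idle(self):
        for conn in self.idle:                # DBA kills sleeping sessions
            conn["alive"] = False

    def checkout(self):
        while self.idle:
            conn = self.idle.pop()
            if conn["alive"]:                 # validation test on borrow
                return conn
        return self._new()                    # replenish: open a fresh one

    def checkin(self, conn):
        self.idle.append(conn)

pool = ToyPool(size=4)    # 4 connections opened up front
pool.server_kill_idle()   # all idle sessions killed server-side
conn = pool.checkout()    # pool discards the 4 dead ones, opens a 5th
print(pool.created)       # → 5
```

If the real pools behave like this toy, killing "cleaned up" sessions too eagerly just trades idle sessions for reconnect overhead; the observation plan above should confirm or refute that.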

Appreciate your time :bow:.

Aubergine
  • This seems more like a DBA/server problem. When your website is well programmed, a connection will only stay open as long as your website hasn't closed it or the MySQL connection wait timeout hasn't been surpassed. A connection pool would recycle them; see https://stackoverflow.com/questions/4284194/terminating-idle-mysql-connections – nbk Jun 13 '20 at 23:46
  • Connection pools are over-rated. Is there any configuration to limit the number of connections? Crank it down to 0 or 1. – Rick James Jun 14 '20 at 00:27
  • I can't limit the number of connections, because the third-party apps will stop working and return 500, spitting "connection refused". – Aubergine Jun 14 '20 at 21:32

1 Answer


The Answer

Yes. From the AWS forum (https://forums.aws.amazon.com/thread.jspa?messageID=708499):

In Aurora the 'cleaned up' state is the final state of a connection whose work is complete but which has not been closed from the client side. In MySQL this field is left blank (no State) in the same circumstance.

Also from the same post:

Ultimately, explicitly closing the connection in code is the best solution here.

From my personal experience as a MySQL DBA, and knowing that "cleaned up" corresponds to what would be a blank State in stock MySQL, I'd definitely kill those connections.

Jesus Uzcanga