1

I am using ASP.NET MVC 5 with MySQL (using Connector/Net). I can connect to the Database and everything works fine... but once the server gets busy, I get the following exception:

MySql.Data.MySqlClient.MySqlException (0x80004005): Unable to connect to any of the specified MySQL hosts.

I am using Ninject to inject the DbContext in the MVC Controller:

Bind<ApplicationDbContext>().ToSelf().InRequestScope();

I have 3 instances of MySQL server in a Galera Cluster, and this is my ConnectionString:

<add name="MyDB" connectionString="Server=IP1,IP2,IP3; Port=3306; Database=dbname; Uid=u1; Pwd=pswd1;" providerName="MySql.Data.MySqlClient" />

This exception generally happens when a bot starts crawling the website.

I have seen this question which is similar however the problem is that they can never connect to MySQL (I believe their ConnectionString is incorrect) but this answer (on the same question) seems to be describing the problem that I am having:

If your connection works initially, but you begin seeing this error after many successful connections, it may be this issue.

In summary: if you open and close a connection, Windows reserves the TCP port for future use for some stupid reason. After doing this many times, it runs out of available ports.

The answer describes the problem perfectly, however their solution is not really helpful:

  1. They suggest to use Connection Pooling (Connector/Net uses Connection Polling by default - so this is not helpful)
  2. They suggest to change Windows settings and assign more ports:

netsh int ipv4 set dynamicport tcp start=10000 num=50000

This document also explains a similar error.

However, I am not sure if this issue is caused because Windows is running out of ports?

Has anyone encountered a similar issue?

Here is the stack trace for the error that I am getting:

Exception: UserId: 0 Url: http://www.my-site.com/house-rental/display/109149 
UserAgent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) 
Message: The underlying provider failed on Open. 
InnerException: MySql.Data.MySqlClient.MySqlException (0x80004005): Unable to connect to any of the specified MySQL hosts. 
    at MySql.Data.Failover.FailoverManager.AttemptConnection(MySqlConnection connection, String originalConnectionString, String& connectionString, Boolean mySqlPoolManager) 
    at MySql.Data.MySqlClient.MySqlConnection.Open() at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext](TTarget target, Action`2 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed) 
    at System.Data.Entity.Infrastructure.Interception.DbConnectionDispatcher.Open(DbConnection connection, DbInterceptionContext interceptionContext) 
    at System.Data.Entity.Core.EntityClient.EntityConnection.Open() Exception: System.Data.Entity.Core.EntityException: The underlying provider failed on Open. ---> MySql.Data.MySqlClient.MySqlException: Unable to connect to any of the specified MySQL hosts. 
    at MySql.Data.Failover.FailoverManager.AttemptConnection(MySqlConnection connection, String originalConnectionString, String& connectionString, Boolean mySqlPoolManager) 
    at MySql.Data.MySqlClient.MySqlConnection.Open() at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext](TTarget target, Action`2 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed) 
    at System.Data.Entity.Infrastructure.Interception.DbConnectionDispatcher.Open(DbConnection connection, DbInterceptionContext interceptionContext) 
    at System.Data.Entity.Core.EntityClient.EntityConnection.Open() --- End of inner exception stack trace --- 
    at System.Data.Entity.Core.EntityClient.EntityConnection.Open() at System.Data.Entity.Core.Objects.ObjectContext.EnsureConnection(Boolean shouldMonitorTransactions) 
    at System.Data.Entity.Core.Objects.ObjectContext.ExecuteInTransaction[T](Func`1 func, IDbExecutionStrategy executionStrategy, Boolean startLocalTransaction, Boolean releaseConnectionOnSuccess) 
    at System.Data.Entity.Core.Objects.ObjectQuery`1.<>c__DisplayClass41_0.b__0() at System.Data.Entity.Core.Objects.ObjectQuery`1.GetResults(Nullable`1 forMergeOption) 
    at System.Data.Entity.Core.Objects.ObjectQuery`1..GetEnumerator>b__31_0() at System.Data.Entity.Internal.LazyEnumerator`1.MoveNext() 
    at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source) 
    at my-site.Infrastructure.Main.UmbrellaServices.AdPersister`1.GetActiveOrInactiveAd(Int64 adBaseId) 
    in C:\my-site\my-site.source\my-site.Infrastructure.Main\UmbrellaServices\AdPersister.cs:line 46 
        at my-site.Application.Main.Services.AdService`2.GetActiveOrInactiveAd(Int64 adBaseId) 

Update

As suggested by @MarcAlff, I ran the following:

SELECT * from performance_schema.host_cache;

and here is the result:

| IP      | HOST                                       | HOST_VALIDATED | SUM_CONNECT_ERRORS | COUNT_HOST_BLOCKED_ERRORS | COUNT_NAMEINFO_TRANSIENT_ERRORS | COUNT_NAMEINFO_PERMANENT_ERRORS | COUNT_FORMAT_ERRORS | COUNT_ADDRINFO_TRANSIENT_ERRORS | COUNT_ADDRINFO_PERMANENT_ERRORS | COUNT_FCRDNS_ERRORS | COUNT_HOST_ACL_ERRORS | COUNT_NO_AUTH_PLUGIN_ERRORS | COUNT_AUTH_PLUGIN_ERRORS | COUNT_HANDSHAKE_ERRORS | COUNT_PROXY_USER_ERRORS | COUNT_PROXY_USER_ACL_ERRORS | COUNT_AUTHENTICATION_ERRORS | COUNT_SSL_ERRORS | COUNT_MAX_USER_CONNECTIONS_ERRORS | COUNT_MAX_USER_CONNECTIONS_PER_HOUR_ERRORS | COUNT_DEFAULT_DATABASE_ERRORS | COUNT_INIT_CONNECT_ERRORS | COUNT_LOCAL_ERRORS | COUNT_UNKNOWN_ERRORS | FIRST_SEEN          | LAST_SEEN           | FIRST_ERROR_SEEN    | LAST_ERROR_SEEN     |
+--------------+-------------------------------------------------+----------------+--------------------+---------------------------+---------------------------------+---------------------------------+---------------------+---------------------------------+---------------------------------+---------------------+-----------------------+-----------------------------+--------------------------+------------------------+-------------------------+-----------------------------+-----------------------------+------------------+-----------------------------------+--------------------------------------------+-------------------------------+---------------------------+--------------------+----------------------+---------------------+---------------------+---------------------+---------------------+
| 1.2.3.4 | ip-1-2-3-4.ap-southeast-2.compute.internal | YES            |                  0 |                         0 |                               0 |                               0 |                   0 |                               0 |                               0 |                   0 |                     0 |                           0 |                       26 |                   4010 |                       0 |                           0 |                           8 |                0 |                                 0 |                                          0 |                             0 |                         0 |                  0 |                    0 | 2020-06-14 11:55:18 | 2022-02-15 07:00:13 | 2020-06-14 23:11:17 | 2022-02-15 01:00:02 |

These are the columns with error:

COUNT_AUTH_PLUGIN_ERRORS: 26
COUNT_HANDSHAKE_ERRORS: 4010
COUNT_AUTHENTICATION_ERRORS: 8
Hooman Bahreini
  • 14,480
  • 11
  • 70
  • 137
  • Just a note on the document you linked on how to reproduce the error - although connection pooling is enabled by default, the connection string in that doc is disabling that feature (pooling=false) so this isn't a valid repro unless that's what you're doing. If you are adding that to your connection string, then don't :) It would help to have an example of how you are using the connection. Are you sharing a connection across threads or creating new connection per request? – emagers Feb 11 '22 at 23:14
  • @EricMagers: I am using Ninject to inject `DbContext` (updated the question). As I explained in the question, `connection pooling` does not solve the error. So if you try the code in the document with connection pooling turned on, you should still get the same error. – Hooman Bahreini Feb 11 '22 at 23:35
  • Do you have multiple server? Or do you use `dns-srv` in connection string? – shingo Feb 14 '22 at 06:54
  • You could lower the [TcpTimedWaitDelay](https://learn.microsoft.com/en-us/biztalk/technical-guides/settings-that-can-be-modified-to-improve-network-performance) but, as I am no expert in networking, I do not know what other problems can get into picture when doing this. – Luuk Feb 14 '22 at 14:29
  • @shingo: I have 3 instances of MySQL in a Galera Cluster, I have updated the question. – Hooman Bahreini Feb 14 '22 at 19:59
  • 1
    My guess is that an actual exception is happening here https://github.com/mysql/mysql-connector-net/blob/3f7a5259b9809f22872dfd8660dc1ca9fc2326f3/MySQL.Data/src/Failover/FailoverManager.cs#L202-L205, MySql.Data is swallowing it, then throwing an unhelpful exception at the end of the `AttemptConnection` method. If we knew the actual exception that was occurring, that would likely help us troubleshoot the issue. It might help to create a TraceListener https://dev.mysql.com/doc/connector-net/en/connector-net-programming-tracing.html and add `;Logging=true` to the connection string. – Bradley Grainger Feb 14 '22 at 20:41
  • You might use multiple connection strings and set one server for each one. Bradley has commented the reason, but using a TraceListener won't help, because the exception is thrown before it starts logging. – shingo Feb 15 '22 at 09:12
  • @shingo: I don't follow. Are you suggesting that I should change my connection string to solve the issue? But how? – Hooman Bahreini Feb 15 '22 at 09:50
  • Not to solve the issue, but to know the exact exception. – shingo Feb 15 '22 at 09:51

3 Answers3

1

A client seeing a failed connection is only one side of the story.

To troubleshoot this, the best is to look at the other side, and see what the server says about failed connections:

SELECT * from performance_schema.host_cache;

In particular, this table will show statistics on each known root cause.

See the reference manual: https://dev.mysql.com/doc/refman/8.0/en/performance-schema-host-cache-table.html

Marc Alff
  • 8,227
  • 33
  • 59
  • In addition to above i would set log_warnings = 2 to log aborted connections and access denied details. looking at this information can help you to pin point the exact issue – Vilsad P P Feb 14 '22 at 14:37
  • @MarcAlff: thanks, I have run the above query and included the result in the question. – Hooman Bahreini Feb 14 '22 at 20:07
0

Firstly I'd like to say that I feel that you've researched your case thoroughly.

The suggestion about reserved ports definitely makes sense. When a connection had been opened Windows reserves the port in case the same client continues. Windows has no way of telling whether the exchange is finished.

If we were in an open source OS we would be trying to find out how the mechanism of port reservation works.With Windows we don't have too much visibility into the inner mechanisms. You could try to reduce the time that the port is reserved. Please see this link https://support.solarwinds.com/SuccessCenter/s/article/Tweaking-Windows-Server-performance-to-prevent-port-exhaustion-when-using-Orion-modules?language=en_US.

As you've searched all the other avenues and nothing else is working you should try opening up more port numbers. The commande netsh int ipv4 show dynamicport tcp will show the ports currently reserved. It seems that the default range is 49152 to 65535. Is any if this range blocked by your firewall?

The following Microsoft Docs makes the point that dynamic ports for IPR and ip6 are allocated differently: https://learn.microsoft.com/fr-fr/troubleshoot/windows-server/networking/default-dynamic-port-range-tcpip-chang. Do you have the possibility to have part of your connections using ip6?

It may sound like an exaggeration to allow an extra 40 000 open ports as suggested but what have you got to lose?

  • Today you face a problem, connections are being refused. You have a possible solution. Try opening the ports 10000-50000 for 24 hours and see what happens. You are not taking a security risk. It is totally reversible. You have nothing to lose.
    If this solves the problem you can then try reducing the permitted range, probably using the higher end of the range, but before talking about optimising you want to know whether this can solve the problem.
DYNAMIC PORTS

When a client connects on port 3306 the computer assigns a dynamic port to the connection, which is communicated to the client.

A dynamic port -- also called a private port -- is one that is assigned to a process or service at the time the port is needed, usually when the process or service is started. When assigning dynamic ports, the OS can use any ports available from the range of ports designated for this purpose.

I recommend that you read more on the subject. I have found a good explanation here https://www.techtarget.com/searchnetworking/definition/dynamic-port-numbers?amp=1.

HARDWARE OR BROADBAND CAUSES I know it sounds obvious have you checked your servers performance and broadband capacities v. use at peak times? If hardware or internet are the limiting factor modifying port settings and MYSQL configuration can't solve it.

  • I don't fully understand the port suggestion... My connection string is configured to connect to MySQL on port 3306, so I am not able to understand how opening other ports help sole the issue? – Hooman Bahreini Feb 21 '22 at 01:30
  • I've added a response in my answer. –  Feb 21 '22 at 05:26
  • Thanks for your description about the ports. According to the document that you have included, the newer versions of Windows assign more dynamic ports by default... I believe assigning more ports, would have helped with the old Windows system but I don't think it is the case for modern Windows system. I think this is a bug caused by Connector/Net – Hooman Bahreini Feb 21 '22 at 21:25
  • May I suggest that you try the suggestions? Even if they don't work you could upvote for the effort to help. –  Feb 21 '22 at 21:34
0

I believe this issue is caused because of a bug in MySQL connector/net which affects multi-host connection strings.

The best solution was to use MySQLConnector, however this solution does not support Entity Framework 6, which is a dependency in my project.

The way I resolved the issue is to use Amazon RDS instead of Galera Cluster... the difference is when using RDS, you will have only one host (RDS takes care of redundancy behind the scene)... so I no longer need to use a multi-host connection string.

Hooman Bahreini
  • 14,480
  • 11
  • 70
  • 137