1

This error is familiar to anyone who ever tried to open a connection using 'localhost' (which transforms into a likely wrong socket path) but...

What could be randomly causing this exception on a TCP connection?

The (simplified) stack trace looks like this :

Uncaught exception 'mysqli_sql_exception'
With message 'No such file or directory' in /var/www/mysite/DB.php:88

#0 [internal function]: mysqli->real_connect()
#1 /var/www/mysite/DB.php(88): mysqli->query('SELECT * FR...')
#2 /var/www/mysite/OtherClass.php(668): DB::query('SELECT * FR...')

This happens after sucessfully making a bunch of other similar queries with THAT same mysqli object/connection, at a seemingly random time.

When this happens, the mysqli object contains the following information, some of which are clearly leftovers from the previous sucessful mysqli->query() call (such as affected_rows):

'affected_rows' => 14,
'client_info' => 'mysqlnd 5.0.11-dev - 20120503 - $Id: 76b08b24596e12d4553bd41fc93cccd5bac2fe7a $'
'client_version' => 50011
'connect_errno' => 2002
'connect_error' => 'No such file or directory'
'errno' => 0
'error' => ''
'error_list' => array ()
'field_count' => 3
'host_info' => '192.168.114.23 via TCP/IP'
'info' => NULL
'insert_id' => 0
'server_info' => '5.5.5-10.1.48-MariaDB'
'server_version' => 50505
'stat' => 'Uptime: 27634  Threads: 1075  Questions: 515109918  
           Slow queries: 92168  Opens: 601885  Flush tables: 1  
           Open tables: 2000  Queries per second avg: 18640.439'
sqlstate' => '00000'
protocol_version' => 10
thread_id' => 4057839
warning_count' => 0               

This happens on PHP 5.6.40 + mysqlnd, running on CLI mode (daemon) under systemd, and a mariadb cluster (4 servers) under moderate load, 1K to 3K sessions per server.

This happens on a script that has another mysqli object/connection open for the same server, but I don't see how that would be a problem. Connections are NOT on persistent (p:...) mode.

The connections are using different credentials and the one that fails does SET SESSION binlog_format = 'ROW' as first query: not sure how that could explain this either.

The error completely baffles me in the context of a TCP mysql connection.

EDIT: using the same credentials and binlog_format for both sessions does not solve the issue.

The code to set up the connections looks like this (for both singletons it is basically the same):

private static function connect()
{
    self::$conn = new mysqli('192.168.144.23', 'web', 'secret', 'products', 3306);
    self::$conn->query("SET SESSION binlog_format = 'ROW'");
    self::$conn->query('SET CHARACTER SET utf8, character_set_connection = utf8');
}

The code to make queries is equally standard

public static function query($query)
{
    if(self::$conn === null) {
        self::connect();
    }

    return $result = self::$conn->query($query)->fetch_all()
}

In the main daemon code, the two singletons are used in an alternated fashion, the order of use varies as it depends on the data:

while($processingLongListOfTasks) {
   ...
   if($someChangingCondition) {
     // exception after a few loops
     $results[] = DB::query("SELECT * FROM `table1`");
   }
   ...
   if($someOtherChangingCondition) {
     $results[] = OtherDB::query("SELECT * FROM `table2`");
   }
   ...
}
istepaniuk
  • 4,016
  • 2
  • 32
  • 60
  • 1
    Are you able to reproduce this error with a more recent version of MariaDB? MariaDB 10.1 is eoled. – Georg Richter Feb 18 '21 at 22:32
  • @GeorgRichter I could eventually try, but I suspect this is only happening in the context of this legacy production system under constant load, so upgrading is not that easy. I want to first stabilize it. The 'Open tables: 2000' bit seems suspicious to me, could be some sort of limit on the server side. – istepaniuk Feb 18 '21 at 22:45
  • 'Open tables' is anything from 1998 to 2015, so that's a dead end. – istepaniuk Feb 18 '21 at 23:01
  • @WillB. There is nothing on the mariadb logs (a systemd journal) when this happens. There are two involved singletons holding one connection each, using a different username each. Both connections seem to work ok until this exception happens. The configuration (parameters passed to `new mysqli()`) are always correct, read from the environment only once on startup. Just your usual plain strings. – istepaniuk Feb 18 '21 at 23:46
  • @GeorgRichter is having two mysqli instances to a same server potentially problematic? – istepaniuk Feb 18 '21 at 23:48
  • *is having two mysqli instances to a same server potentially problematic?* Typically no, as the connection client will send the request from a separate port number and the protocol will handle the request priority. Usually two connections are more performant, since the clients can be handled individually. However there are certain circumstances that can cause issues, such as running PHP as a daemon & memory leaks, conflicting configurations, or improper code usage. We would need to see more code/configs to determine specific issues. – Will B. Feb 19 '21 at 03:07
  • @WillB. `real_connect` is an internal extension function in the stack. Only `new mysqli()` is being used. I am adding some example code. – istepaniuk Feb 19 '21 at 12:03
  • Having tried the same, the issue is more than likely caused by running PHP as a daemon. PHP is notoriously bad at being run as a service, due to a [variety of issues](https://stackoverflow.com/a/647801/1144627) and memory leaks. In essence you are creating a persistent connection by using a loop to keep PHP running. I recommend switching to `supervisord` instead, which lets you queue the script like a cron job, that will exit and restart the script after it has run X number of times. Otherwise I recommend forwarding the request with something like Node.js to your HTTP server-side. – Will B. Feb 19 '21 at 18:48
  • Thanks for your input @WillB. As I say in my question, it runs under systemd. There is no (relevant) memory leak here, this happens seconds after startup. This processes RabbitMQ messages, and you can imagine that if it's PHP 5.6 and has not even been upgraded, replacing the whole thing is not cheap or easy. Forwarding to HTTP is interesting but It's also not easy to get the same performance, these scripts handle thousands of messages per second. – istepaniuk Feb 19 '21 at 18:53
  • I would try to investigate the reason for a large amount of open tables. If there are no file handles available anymore, server can't accept new connections. SHOW variables like 'table_open_cache' (maybe you should reduce it) but also results from show status, show processlist and show open tables would be useful. – Georg Richter Feb 19 '21 at 21:33

0 Answers0