
I have a single-threaded server written in C that accepts TCP/UDP connections based on EPOLL and supports plugins for the multitude of protocol layers we need to support. That bit is fine.

Because the server is single-threaded, I wanted to implement a database layer that could use the same EPOLL architecture rather than separately iterating over all of the open connections.

We use MariaDB and the MariaDB connector, which supports non-blocking functions in its API.

https://mariadb.com/kb/en/mariadb/using-the-non-blocking-library/

But what I'm finding is not what I expected, and what I was expecting is described below.

First I call mysql_real_connect_start(), and if it returns zero we dispatch the query immediately, since that indicates no blocking was required, although this never happens.

Otherwise, I fetch the file descriptor, which seems to be available immediately, register it with EPOLL, and return to the main EPOLL loop to wait for events.

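s = mysql_get_socket(mysql);

if (s > 0)
{
    brt_socket_set_fds(endpoint, s);

    // Register the connector's socket with the main epoll instance.
    struct epoll_event event = {0};
    event.data.fd = s;
    event.events = EPOLLRDHUP | EPOLLIN | EPOLLET | EPOLLOUT;

    // Keep the epoll_ctl result separate so the descriptor in s is not overwritten.
    int rc = epoll_ctl(efd, EPOLL_CTL_ADD, s, &event);
    if (rc == -1) {
        syslog(LOG_ERR, "brd_db : epoll_ctl error: %m");
        // handle error.
    }
...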

So, then some time later I do get the EPOLLOUT indicating the socket has been opened.

And I dutifully call mysql_real_connect_cont() but at this stage it is still returning a non-zero value, indicating I must wait longer?

But that is the last EPOLL event I get, except for the EPOLLRDHUP when, I guess, MariaDB hangs up after 10 seconds.

Can anyone help me understand if this idea is even workable?

Thanks... Thanks... so much Thanks.

dave.zap
  • An update : actually on mysql_real_connect_cont I'm getting the error "0x6643a7 "Lost connection to MySQL server at 'handshake: reading inital communication packet', system error: 11"" – dave.zap Jun 20 '16 at 07:42
  • An update : I dumped the sample code from the MariaDB site and it works fine, so I assume I'm doing something wrong between the _start and _cont in my application code. – dave.zap Jun 20 '16 at 08:02

1 Answer


OK, for anyone else who lands here: I fixed it, or rather un-broke it.

Notice that, in the examples, the status returned from each _start / _cont call is passed as a parameter to the next _cont. It turns out this is critical.

The status contains the flags MYSQL_WAIT_READ, MYSQL_WAIT_WRITE, MYSQL_WAIT_EXCEPT and MYSQL_WAIT_TIMEOUT, and my guess is that if it is not passed to the next _cont you are interfering with the _cont state machine.
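To make the flags a bit more concrete, here is a minimal sketch of how a wait status could be turned into an epoll event mask. The helper name wait_status_to_epoll is hypothetical, and the #define values are stand-ins so the snippet compiles on its own; in a real build they come from <mysql.h>. MYSQL_WAIT_TIMEOUT has no socket event, so it would have to be handled by a timer rather than by the mask.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Stand-in values so this sketch builds without the connector headers.
 * Assumption: real code includes <mysql.h>, which defines these flags. */
#ifndef MYSQL_WAIT_READ
#define MYSQL_WAIT_READ    1
#define MYSQL_WAIT_WRITE   2
#define MYSQL_WAIT_EXCEPT  4
#define MYSQL_WAIT_TIMEOUT 8
#endif

/* Hypothetical helper: translate the wait status returned by a
 * _start / _cont call into the epoll events to register for. */
static uint32_t wait_status_to_epoll(int status)
{
    uint32_t events = 0;

    if (status & MYSQL_WAIT_READ)
        events |= EPOLLIN;   /* library wants to read from the socket */
    if (status & MYSQL_WAIT_WRITE)
        events |= EPOLLOUT;  /* library wants to write to the socket */
    if (status & MYSQL_WAIT_EXCEPT)
        events |= EPOLLPRI;  /* library waits for exceptional data */

    /* MYSQL_WAIT_TIMEOUT is deliberately not mapped: it needs a timer. */
    return events;
}
```

The resulting mask would go into event.events before epoll_ctl(), in place of a fixed set of flags.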

I was not saving the status between the different places where _start and _cont were being called.

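// Per-connection state that must survive between epoll callbacks.
struct MC
{
    MYSQL *mysql;
    int status;   // wait flags returned by the last _start / _cont call
};
...
// Initial call
mc->status = mysql_real_connect_start(&ret, mc->mysql, host, user, password, NULL, 0, NULL, 0);

// EPOLL raised calls.
mc->status = mysql_real_connect_cont(&ret, mc->mysql, mc->status);
if (mc->status) return... // non-zero: still pending, keep waiting; zero: finished, check ret for errors.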
dave.zap
  • This is incorrect. You should call epoll or select and then pass only the events that have actually occurred. You need the returned status only to know which events to wait for. See the example here: https://mariadb.com/kb/en/using-the-non-blocking-library/ – jcoffland Mar 25 '22 at 13:47
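
As a rough illustration of what the comment above describes, the events reported by epoll_wait() can be translated back into MYSQL_WAIT_* flags, and only those flags passed to the next _cont call. This is a sketch under assumptions: epoll_to_ready_status and its timed_out parameter are hypothetical names, the #define values are stand-ins for what <mysql.h> provides, and EPOLLHUP / EPOLLERR handling is left out for brevity.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Stand-in values so this sketch builds without the connector headers.
 * Assumption: real code includes <mysql.h>, which defines these flags. */
#ifndef MYSQL_WAIT_READ
#define MYSQL_WAIT_READ    1
#define MYSQL_WAIT_WRITE   2
#define MYSQL_WAIT_EXCEPT  4
#define MYSQL_WAIT_TIMEOUT 8
#endif

/* Hypothetical helper: build the ready status for the next _cont call
 * from the events that epoll_wait() actually reported. timed_out is
 * non-zero when the caller's own timer expired before any event fired. */
static int epoll_to_ready_status(uint32_t events, int timed_out)
{
    int ready = 0;

    if (events & EPOLLIN)
        ready |= MYSQL_WAIT_READ;
    if (events & EPOLLOUT)
        ready |= MYSQL_WAIT_WRITE;
    if (events & EPOLLPRI)
        ready |= MYSQL_WAIT_EXCEPT;
    if (timed_out)
        ready |= MYSQL_WAIT_TIMEOUT;

    return ready;
}
```

In the epoll callback this would look something like status = mysql_real_connect_cont(&ret, mysql, epoll_to_ready_status(ev.events, 0)); where the status it returns says which events to wait for next. When the status includes MYSQL_WAIT_TIMEOUT, the connector's mysql_get_timeout_value() reports how long to wait.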