I'm writing a user-space application which, among other things, uses netlink sockets to talk to the kernel. I use the simple API provided by the open-source library libmnl.

My application sets certain options over netlink, and it also subscribes to netlink events (notifications) and parses them. This second feature (event notifications) is asynchronous, so for now I have implemented a simple select()-based loop:

...
fd_set rfd;
struct timeval tv;
int ret;

while (1) {
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    FD_ZERO(&rfd);
    /* fd is a netlink socket */
    FD_SET(fd, &rfd);

    ret = select(fd + 1, &rfd, NULL, NULL, &tv);
    if (ret < 0) {
        perror("select()");
        continue;
    } else if (ret == 0) {
        printf("Timeout on fd %d\n", fd);
    } else if (FD_ISSET(fd, &rfd)) {
        /*
            count = recv(fd, buf ...)
            while (count > 0) {
                parse 'buf' for netlink message, validate etc.
                count = recv(fd, buf)
            }
        */
    }
}

What I observe is that the code inside the else if (FD_ISSET(fd, &rfd)) { branch blocks at the second recv() call.

Now I'm trying to understand whether I need to set the netlink socket to non-blocking (with SOCK_NONBLOCK, for example). But then I probably don't need select() at all: I can simply have a recv() -> parse message -> recv() loop, and it won't block.

Mark
  • If you don't use `select()`, your code will run in a tight loop continuously calling `recv()` and getting `EWOULDBLOCK` errors, unless you insert a sleep in the loop. – Barmar Aug 14 '20 at 22:32
  • There is a reason it blocks - there's no data to receive. You can certainly make the socket non-blocking but what result do you want that to achieve? You would have to either go back and call `recv` again until there is data or have logic that deals with the no-data case. So whether it is the right thing to do depends on what the desired behaviour is. – kaylum Aug 14 '20 at 23:05
  • Does this answer your question? [socket select ()versus non-block recv](https://stackoverflow.com/questions/19169378/socket-select-versus-non-block-recv) and [several others](https://www.google.com/search?q=site%3Astackoverflow.com+select+vs.+non-blocking). – Steffen Ullrich Aug 15 '20 at 04:47
  • It doesn't make much sense to use `select()` in blocking mode. You may as well just block in `recv()`. – user207421 Aug 15 '20 at 06:54

1 Answer

... if I need to set the netlink socket to non-blocking ..., but then I probably don't need select() at all ...

That is exactly the purpose of a non-blocking socket: instead of checking if(FD_ISSET(...)), you call recv() and evaluate its return value.
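As a rough sketch of what "evaluate the return value" could look like in C: the helper names set_nonblocking() and try_recv() and the would_block flag are made up for this illustration and are not part of libmnl or of the question's code. The sketch assumes a POSIX system and treats EAGAIN/EWOULDBLOCK as "nothing queued right now" rather than as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switch an already-open descriptor to non-blocking mode.
 * Returns 0 on success, -1 on error (errno set by fcntl()). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: one receive attempt that never waits on a
 * non-blocking socket.
 * Returns the byte count on success. Returns -1 otherwise, with
 * *would_block set to 1 if the receive queue was simply empty and
 * to 0 if a real error occurred (errno describes it). */
ssize_t try_recv(int fd, void *buf, size_t len, int *would_block)
{
    assert(buf != NULL && len > 0);
    *would_block = 0;

    for (;;) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;            /* interrupted by a signal: try again */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            *would_block = 1;    /* nothing queued, not a failure */
        return -1;
    }
}
```

With libmnl, the descriptor would presumably come from mnl_socket_get_fd(). Without select(), poll() or a sleep between attempts, a loop around try_recv() spins at full speed while the queue is empty, as the first comment under the question points out.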

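If you use blocking sockets, you must not call recv() more than once after each select() that reports the socket as readable. Only that first recv() can be expected to return without waiting; if you stick to this rule, the program is "effectively" non-blocking.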
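A minimal sketch of the "one recv() per select()" rule, assuming a blocking socket on a POSIX system. The function name recv_once_when_ready() and the RECV_TIMED_OUT sentinel are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical sentinel, chosen so it cannot collide with a recv() result. */
#define RECV_TIMED_OUT (-2)

/* Wait up to timeout_sec seconds for fd to become readable, then call
 * recv() exactly once.
 * Returns the byte count, RECV_TIMED_OUT if nothing arrived in time,
 * or -1 on error (errno set by select() or recv()). */
ssize_t recv_once_when_ready(int fd, void *buf, size_t len, long timeout_sec)
{
    fd_set rfd;
    struct timeval tv;
    int ret;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* select() may modify both the set and the timeout,
     * so both are initialised on every call. */
    FD_ZERO(&rfd);
    FD_SET(fd, &rfd);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    ret = select(fd + 1, &rfd, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    if (ret == 0)
        return RECV_TIMED_OUT;

    /* One recv() per readiness report; a second call could block. */
    return recv(fd, buf, len, 0);
}
```

The outer while (1) loop from the question would call this once per iteration and parse whatever the single recv() delivered; anything still queued is picked up on the next iteration.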

HOWEVER, as user "kaylum" already suggested in a comment, you'll have another problem in either case:

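It is not guaranteed that one complete "message" is available at once. On a stream socket, the other end might send the first part of the message, wait some seconds and then send the second part. A netlink socket is datagram-based, so each recv() returns one whole datagram, but a multipart reply can still be spread over several datagrams.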

select() only tells you that there is something to read (at least one byte); it does not tell you whether the complete message is available.

If you want to wait for the complete message in the inner loop (while(count > 0)), you will always have to wait, which means that your program "effectively" has blocking behavior even if the socket is non-blocking.

If you simply want to process all bytes already available in the inner loop, then the condition count > 0 is wrong: on a blocking socket, recv() does not return 0 when nothing is queued, it waits for more data. Instead, you should do something like this if you are working with blocking sockets:

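else if(FD_ISSET(...))
{
    while(FD_ISSET(...))
    {
        count = recv(...);
        if(count > 0)
        {
            ...
            /* Ask again whether more data is queued; select() rewrites
               the fd set, so the FD_ISSET() above sees the new result. */
            select(...);
        }
        else FD_ZERO(...); /* error or end of data: leave the loop */
    }
}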

However, in most cases this will not be necessary and you can simply process the "remaining" data bytes in the next "outer" loop.
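For the cases where draining really is wanted, the re-check-with-select() loop could be filled in roughly as follows. This is only a sketch for a blocking socket on a POSIX system: the names readable_now() and drain_ready() are hypothetical, the parsing step is left as a comment, and the zero timeout is an assumption made here so that the re-check never waits.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if fd is readable right now,
 * 0 if not, -1 on error. The zero timeout makes select() return at once. */
int readable_now(int fd)
{
    fd_set rfd;
    struct timeval zero = { 0, 0 };
    int ret;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfd);
    FD_SET(fd, &rfd);

    ret = select(fd + 1, &rfd, NULL, NULL, &zero);
    if (ret < 0)
        return -1;
    return (ret > 0 && FD_ISSET(fd, &rfd)) ? 1 : 0;
}

/* Hypothetical helper: consume everything that is already queued on a
 * blocking socket without waiting for more.
 * Returns the number of recv() calls that delivered data, or -1 on error.
 * *total_bytes receives the number of bytes read. */
int drain_ready(int fd, char *buf, size_t len, size_t *total_bytes)
{
    int chunks = 0;
    int ready;

    *total_bytes = 0;
    while ((ready = readable_now(fd)) > 0) {
        ssize_t count = recv(fd, buf, len, 0);
        if (count < 0)
            return -1;
        if (count == 0)
            break;      /* end of stream or empty datagram: stop here */

        /* parse buf[0 .. count) here */
        *total_bytes += (size_t)count;
        chunks++;
    }
    return ready < 0 ? -1 : chunks;
}
```

On Linux, passing MSG_DONTWAIT to recv() is another way to get a single non-waiting read on an otherwise blocking socket; that variant would stop on EAGAIN instead of re-running select().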

Martin Rosenau