
I have two UDP sockets bound to the same address and connected to addresses A and B. I have two more UDP sockets bound to A and B and not connected.

This is what my /proc/net/udp looks like (trimmed for readability):

  sl  local_address rem_address
 3937: 0100007F:DD9C 0300007F:9910
 3937: 0100007F:DD9C 0200007F:907D
16962: 0200007F:907D 00000000:0000
19157: 0300007F:9910 00000000:0000
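For readers unfamiliar with the format, here is a small sketch that decodes the hex `address:port` pairs above. It assumes a little-endian host (e.g. x86), where `/proc/net/udp` prints the IPv4 address as a byte-swapped 32-bit hex value; `decode_proc_addr` is a hypothetical helper name, not part of any library.

```python
import socket
import struct

def decode_proc_addr(field):
    """Decode an 'ADDR:PORT' field from /proc/net/udp (IPv4 only).

    Assumption: the file was produced on a little-endian machine, so the
    hex address must be repacked as a little-endian 32-bit integer.
    """
    addr_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)

if __name__ == "__main__":
    for field in ("0100007F:DD9C", "0200007F:907D", "0300007F:9910"):
        print(field, "->", decode_proc_addr(field))
```

Read this way, both connected sockets share the local address 127.0.0.1:56732, and A and B are 127.0.0.2:36989 and 127.0.0.3:39184.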

According to connect(2): "If the socket sockfd is of type SOCK_DGRAM, then addr is the address to which datagrams are sent by default, and the only address from which datagrams are received."

For some reason, my connected sockets are receiving packets that were destined for each other. For example: the UDP socket connected to A sends a message to A, and A sends a reply back. The UDP socket connected to B sends a message to B, and B sends a reply back. But the reply from A arrives at the socket connected to B, and the reply from B arrives at the socket connected to A.

Why on earth would this be happening? Note that it happens randomly - sometimes the replies arrive at the correct sockets and sometimes they don't. Is there any way to prevent this or any situation under which connect is supposed to not work?
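A minimal sketch of the setup described above, for anyone who wants to try it. It assumes Linux, that the two connected sockets share their local address via `SO_REUSEPORT` (the option the answer below discusses), and that A and B live on 127.0.0.2 and 127.0.0.3 as in the listing. The function names are made up for illustration, and the sketch only reports which socket received a reply; it does not claim any particular outcome.

```python
import select
import socket

LOCAL_IP = "127.0.0.1"    # 0100007F in the listing
PEER_A_IP = "127.0.0.2"   # 0200007F
PEER_B_IP = "127.0.0.3"   # 0300007F

def make_peer(ip):
    """Unconnected UDP socket standing in for A or B (ephemeral port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind((ip, 0))
    s.settimeout(1.0)
    return s

def make_connected(local, remote):
    """UDP socket bound to `local` with SO_REUSEPORT and connected to `remote`."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(local)
    s.connect(remote)
    return s

def setup():
    peer_a = make_peer(PEER_A_IP)
    peer_b = make_peer(PEER_B_IP)
    to_a = make_connected((LOCAL_IP, 0), peer_a.getsockname())
    # The second socket reuses the exact address the first one was given.
    to_b = make_connected(to_a.getsockname(), peer_b.getsockname())
    return peer_a, peer_b, to_a, to_b

def round_trip(sender, peer, candidates):
    """Send via `sender`, let `peer` reply, return names of sockets that got it."""
    sender.send(b"ping")
    _, addr = peer.recvfrom(1024)
    peer.sendto(b"pong", addr)
    readable, _, _ = select.select(list(candidates.values()), [], [], 1.0)
    for s in readable:
        s.recv(1024)  # drain so the next round starts clean
    return [name for name, s in candidates.items() if s in readable]

if __name__ == "__main__":
    peer_a, peer_b, to_a, to_b = setup()
    socks = {"to_a": to_a, "to_b": to_b}
    print("reply from A received by:", round_trip(to_a, peer_a, socks))
    print("reply from B received by:", round_trip(to_b, peer_b, socks))
```

Running the two round trips in a loop should show whether replies ever land on the wrong socket on a given kernel.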

Shum
  • Here's a Python example to reproduce this behavior: https://gist.github.com/povilasb/53f1c802dbc2aca36a0ffa5b4cb95536 – PovilasB Aug 02 '18 at 06:26

1 Answer


Ehm, as far as I can see there is no ordering guarantee.

From the man page:

      SO_REUSEPORT (since Linux 3.9)
              Permits multiple AF_INET or AF_INET6 sockets to be bound to an identical socket address. This option must be set on each socket (including the first socket) prior to calling bind(2) on the socket. To prevent port hijacking, all of the processes binding to the same address must have the same effective UID. This option can be employed with both TCP and UDP sockets.

              For TCP sockets, this option allows accept(2) load distribution in a multi-threaded server to be improved by using a distinct listener socket for each thread. This provides improved load distribution as compared to traditional techniques such using a single accept(2)ing thread that distributes connections, or having multiple threads that compete to accept(2) from the same socket.

              For UDP sockets, the use of this option can provide better distribution of incoming datagrams to multiple processes (or threads) as compared to the traditional technique of having multiple processes compete to receive datagrams on the same socket.

So you're using something that is mainly seen as an option for servers (or in some cases clients, but ordering can never be guaranteed - especially in UDP) as a client.

I suspect your approach is wrong, and needs a rethink =)

PS. Just had a quick glance but IMHO it's a bug in your approach
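One possible rethink, sketched here as an assumption rather than something this answer spells out: use a single unconnected socket and route each datagram by the source address that `recvfrom` reports, so nothing depends on how the kernel picks between sockets sharing a port. `recv_demux` and the `handlers` mapping are hypothetical names.

```python
import socket

def recv_demux(sock, handlers, bufsize=2048):
    """Receive one datagram and dispatch it by its source address.

    `handlers` maps (ip, port) tuples to callables taking the payload.
    Datagrams from unknown sources are dropped and None is returned.
    """
    data, src = sock.recvfrom(bufsize)
    handler = handlers.get(src)
    if handler is None:
        return None
    return handler(data)

if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    peer.bind(("127.0.0.1", 0))
    handlers = {peer.getsockname(): lambda payload: ("from peer", payload)}
    peer.sendto(b"hello", sock.getsockname())
    print(recv_demux(sock, handlers))
```

The trade-off is that the application, not the kernel, does the per-peer filtering, which also means it has to discard traffic from unexpected sources itself.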

  • The fact that the sockets are connected though means that they should only be receiving datagrams from the address they are connected to. At least according to `man connect`. – Shum Aug 03 '18 at 05:42
  • UDP doesn't track sessions, the packets will be delivered to either process - which is the purpose of SO_REUSEPORT as I see it – Ian Kumlien Aug 03 '18 at 14:32
  • The man page for connect says that a connected UDP socket should only receive datagrams from the address that it's connected to. This isn't the behaviour I'm seeing. Note that this code works fine on Windows. – Shum Aug 03 '18 at 15:09
  • This is from the man page: "For UDP sockets, the use of this option can provide better distribution of incoming datagrams to multiple processes (or threads) as compared to the traditional technique of having multiple processes compete to receive datagrams on the same socket." The fact that it works on Windows might just mean that you are lucky about data-structure ordering, i.e. it's not something you should rely on. – Ian Kumlien Aug 03 '18 at 16:00
  • Yes but what about the fact that (quoting the connect man page again): *"If the socket sockfd is of type SOCK_DGRAM, then addr is the address to which datagrams are sent by default, **and the only address from which datagrams are received**."* Are you saying the connect man page is wrong? None of the information you've linked suggests this. – Shum Aug 04 '18 at 07:24
  • No, it's not wrong, it's how it works until you enable SO_REUSEPORT - since it actually changes the behavior as documented in man 7 socket – Ian Kumlien Aug 05 '18 at 09:33
  • No, it's not documented. Perhaps it should be, but it's not. man 7 socket says that binding multiple sockets using SO_REUSEPORT **can** provide distribution of incoming datagrams - and that's exactly what I'd expect if both sockets were unconnected and able to receive datagrams from anywhere. But it does not say that SO_REUSEPORT disables the rule specified in man 2 connect that a socket should only receive datagrams from the address it is connected to. Nor is there any contradiction between the two rules we're quoting. Also, this code works reliably on Windows. – Shum Aug 07 '18 at 06:17
  • SO_REUSEPORT apparently doesn't exist on Windows, it apparently uses SO_REUSEADDR instead as stated by the link I posted above. I'll echo Willem de Bruijn here, "Then this is working as intended." – Ian Kumlien Aug 07 '18 at 08:10
  • Another great explanation: https://stackoverflow.com/questions/14388706/socket-options-so-reuseaddr-and-so-reuseport-how-do-they-differ-do-they-mean-t/14388707#14388707 – Ian Kumlien Aug 07 '18 at 08:21