Implementing Accept for UDP

Question

UDP sockets have a 'connect' call, but no 'accept' call for server applications. Some socket APIs perform better on a connected UDP socket (e.g. recvmmsg/sendmmsg), and on a connected socket they are the best-performing system calls for a single flow at very high packet rates (anything higher requires kernel bypass, such as DPDK).
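
For readers unfamiliar with the call, here is a minimal sketch of a batched send on a connected socket. It assumes Linux (`sendmmsg` is a Linux extension); the function name `send_batch_connected`, the `BATCH` limit and the 64-byte payload slots are made up for illustration. The point is only that `msg_name` stays NULL because the peer was already set by `connect`.

```c
#define _GNU_SOURCE           /* sendmmsg() and struct mmsghdr are Linux extensions */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 4               /* hypothetical batch size for this sketch */

/* Send `count` payloads of `len` bytes each on a connected datagram socket.
 * Returns the number of datagrams sent, or -1 on error. */
int send_batch_connected(int fd, char payloads[][64], size_t len, unsigned count)
{
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];

    assert(count <= BATCH && len <= 64);
    memset(msgs, 0, sizeof(msgs));
    for (unsigned i = 0; i < count; i++) {
        iov[i].iov_base = payloads[i];
        iov[i].iov_len = len;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        /* msg_name stays NULL and msg_namelen 0: the peer set by connect() is used. */
    }
    return sendmmsg(fd, msgs, count, 0);
}
```

On an unconnected socket the same loop would also have to fill in `msg_name` and `msg_namelen` for every message.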

Anyhow, I am unable to find a solution that implements accept for a UDP server, so my thoughts are to do the following:

1. The server socket listens until it receives a packet from a client.
2. The server calls connect on the socket, thus allowing traffic to be sent to the client using the accelerated connected APIs (i.e. connected sendmmsg over unconnected sendmmsg).
3. The server listens on a cloned socket of #1.

To solve the accept problem, I am not sure how to clone #1. The user of my "server" library passes in a socket file descriptor, meaning they control which options were configured (SO_RCVBUF, etc.) and I have no visibility into them. Unfortunately, in order to clone the socket, I now need that visibility.

Anyway, if there is another way to solve the accept thing, or clone the socket, I'd love to know! Thank you!

Are you using multicast? If not, multiple UDP sockets on the same port can behave unpredictably. — dbush, May 14 '21 at 21:48
This seems relevant: [**Can two applications listen to the same port?**](https://stackoverflow.com/questions/1694144/can-two-applications-listen-to-the-same-port) — Andrew Henle, May 14 '21 at 21:49
The `socket(7)` manual page lists available socket options, and how to get or set them. — Sam Varshavchik, May 14 '21 at 22:04
@dbush - not using multicast in this scenario. The purpose for this is to essentially implement a version of "accept", but for udp. When a server receives traffic from a client, it then calls 'connect' on the socket, thereby allowing sendmmsg to work to send traffic in bulk without providing the sender address in the system calls. However, once you connect, you no longer receive traffic from other clients. So you need to open another socket that is listening on the same address/port. That will receive connections from other clients and then the procedure is repeated — Gabe, May 14 '21 at 22:05
@SamVarshavchik - that is what I'm afraid I'll have to do, iterate through every permutation of socket option. Was hoping there was an easier way to clone a socket. — Gabe, May 14 '21 at 22:08
This smells [like an XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). What problem are you trying to solve? No, not the one about cloning a socket, but the problem to which you believe the solution is to clone a socket, so that's what you're asking about. Perhaps if the real problem is explained, a simpler solution will become possible. — Sam Varshavchik, May 14 '21 at 22:10
The real problem is that udp offers a "connect" but no "accept". So APIs that require a connected socket are a challenge to use in a server scenario. So the problem is this: I'd like to use sendmmsg to send traffic to clients that talk to my udp server. Once you call connect on the socket, you are limited to only communicating to that one client. If "accept" existed, like it does for TCP, you'd get a pseudo-cloned socket where the server continues listening on the original socket, and client communication is available with the "accept"ed socket. AFAIK - that problem has not been solved. — Gabe, May 14 '21 at 22:14
"*APIs that require a connected socket are a challenge to use in a server scenario. So the problem is this: I'd like to use sendmmsg*" That's the part that sounds like an XY problem. `sendmsg` does not require a connected socket. — kaylum, May 14 '21 at 22:28
@kaylum - correct. Sorry I should have said I want to use the connected socket due to the significant performance increases in using a connected socket for both transmit and receive. Using sendmmsg over sendmsg is huge, but so is using the connected socket and sendmmsg over the unconnected socket and sendmmsg. I have modified the question to include that information. — Gabe, May 14 '21 at 22:46
@Gabe You're confusing the semantics of "connect" in a TCP context with "connect" in UDP. TCP is by design a connection-oriented protocol, so "connect" in TCP literally establishes a connection, leaving the original port that was accepting connections effectively unchanged. UDP is ***connectionless*** - there is never a connection. When you send a UDP packet, it's completely independent from all other packets, and it's merely addressed to an IP/port combination. TCP is a phone call - you call and establish a connection. UDP is a letter - whoever goes to the mailbox at the address gets it. — Andrew Henle, May 14 '21 at 23:22
(cont) If you want to build a connection-oriented protocol **on top of** UDP, [you won't be the first](https://www.google.com/search?q=udp-based+data+transfer+protocol). But that's a ***lot*** more complex than "cloning" a port that's used to receive UDP packets. — Andrew Henle, May 14 '21 at 23:25
@AndrewHenle - I knew I would get multiple "Hey, udp is connectionless" answers. I'm not talking about that, but rather using the socket API. When a socket is "connected" -- in terms of the API, you don't have to provide the remote endpoint address. This enables optimizations in performance. The kernel doesn't have to do a route lookup on every datagram on a "connected" socket. I was afraid when asking this question application-level devs would jump on the "TCP" is for connections, UDP is not! I'm talking socket API, not protocol. I'm afraid most people commenting here don't understand that. — Gabe, May 15 '21 at 03:27

dbush · Accepted Answer · 2021-05-19T16:43:17.223

There's no built-in capacity to clone a socket. You would need to keep track of whatever options you set on one socket and set those on a new socket.

You have a larger problem however. If you have multiple UDP sockets open on the same port and a unicast packet comes in, only one of those sockets will receive it and you can't accurately predict which one it will be. So the whole concept of having a "listening" socket and multiple "accepted" sockets for UDP connections won't work.

Your program will need to have a single socket that handles all incoming packets and multiplexes them based on the sender, possibly with one thread to receive and one thread per client but try it without threads first.
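
A minimal, single-threaded sketch of that multiplexing, assuming IPv4 and a small fixed-size table. The names `struct client`, `lookup_client`, `receive_one` and the `MAX_CLIENTS` limit are invented for this example; a real server would likely use a hash table and expire idle entries.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MAX_CLIENTS 64   /* hypothetical limit for this sketch */

struct client {
    struct sockaddr_in addr;   /* sender's IP and port */
    unsigned long packets;     /* datagrams seen from this sender */
    int in_use;
};

static struct client clients[MAX_CLIENTS];

static int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_addr.s_addr == b->sin_addr.s_addr && a->sin_port == b->sin_port;
}

/* Find the slot for `from`, creating one if the sender is new.
 * Returns NULL when the table is full. */
struct client *lookup_client(const struct sockaddr_in *from)
{
    struct client *free_slot = NULL;

    assert(from != NULL);
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].in_use && same_peer(&clients[i].addr, from))
            return &clients[i];
        if (!clients[i].in_use && free_slot == NULL)
            free_slot = &clients[i];
    }
    if (free_slot != NULL) {
        free_slot->addr = *from;
        free_slot->packets = 0;
        free_slot->in_use = 1;
    }
    return free_slot;
}

/* Receive one datagram on the shared socket and attribute it to its sender.
 * The byte count (or -1) is stored in *n_out. */
struct client *receive_one(int fd, char *buf, size_t cap, ssize_t *n_out)
{
    struct sockaddr_in from;
    socklen_t from_len = sizeof(from);

    *n_out = recvfrom(fd, buf, cap, 0, (struct sockaddr *)&from, &from_len);
    if (*n_out < 0)
        return NULL;

    struct client *c = lookup_client(&from);
    if (c != NULL)
        c->packets++;
    return c;
}
```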

EDIT:

Given the use of connected UDP sockets, it looks like you can have one unconnected socket as the listener and a connected socket for each remote endpoint you want to talk to. Since there's no clone function, any new socket you create to handle a different endpoint will briefly be in an unconnected state between when the socket is created and when connect is called on it. During that time, incoming packets not associated with a connected socket could go either to the "listener" or to this new socket before it connects. You'll need to handle this case in your application, most likely by having the connected socket drop unknown packets and by having clients retry their initial "connection" until they receive a response.

edited May 19 '21 at 16:43

answered May 14 '21 at 23:04

dbush


During the connection process, there is a race condition, but after the socket is connected it is 100% deterministic. Once connect is called, the connected socket will ONLY receive packets from the remote client. The other non-connected socket will receive all other traffic. This is understood. Given there are other ways packets are lost during initial connection (see unres_qlen), this is understood and acceptable. – Gabe May 14 '21 at 23:15
@Gabe I don't think that's true. The thing that you're connecting isn't the socket, it's the endpoint. And if multiple sockets reference the same endpoint, they're either all connected or none of them are. Some documentation will say that you're connecting "the socket", but they're just being sloppy with terms. (That's obviously false for TCP, right?) What `connect` does is perform a "connect" operation on the endpoint the socket references, also connecting any other references to that same endpoint. – David Schwartz May 14 '21 at 23:18
@DavidSchwartz it is true... quoting from connect docs `If the socket sockfd is of type SOCK_DGRAM then addr is the address to which datagrams are sent by default, and the only address from which datagrams are received`. I have also tested and verified the documentation. – Gabe May 15 '21 at 03:21
@Gabe I don't understand why you think that contradicts what I said and supports what you said. That doesn't mean that another socket referencing the same endpoint won't also be connected. Sockets have *very* few properties and are basically a thin wrapper around a communications endpoint. Most property setting functions don't set the properties of the socket itself but of the endpoint it references and will also affect other sockets that reference the same endpoint. An endpoint cannot be cloned or duplicated. – David Schwartz May 15 '21 at 19:42
@DavidSchwartz - I think then you misread my first comment. I wasn't saying two connected sockets (to the same endpoint), but rather one that is connected and the other unconnected. The connected socket will only receive packets from the endpoint provided in the connect call. The "unconnected" socket will receive packets from all other endpoints (except those belonging to the "connected" remote endpoint). I have tested this exhaustively; and it aligns with docs. I do not want to have two sockets that are connected to/referencing the same endpoint. I agree with you on what "connect" does. – Gabe May 19 '21 at 15:06
@Gabe How did you manage to get two sockets (one connected and one unconnected) to the same endpoint? The endpoint is either connected or it isn't. And if they're to different endpoints, they cannot have the same local IP address and port. – David Schwartz May 19 '21 at 16:21
@DavidSchwartz You create one `SOCK_DGRAM` socket, set `SO_REUSEPORT`, and bind to IP 0.0.0.0 port *x*. Then you create another socket just like that one, then use `connect` on that socket to a given IP/port (ex IP1:p1). The latter socket will get packets from the connected IP/port, and the former socket will get all others. If you then create a third socket connected to IP2:p2, that socket will get packets from IP2:p2 and the unconnected socket will get packets from all but IP1:p1 and IP2:p2. – dbush May 19 '21 at 16:37
@dbush -- yay! Someone understands! I probably should have mentioned the use of SO_REUSEPORT in the original question. I see in your updated answer you see the race conditions as well; and there are a few. I wish the kernel api had created an `accept` as well for datagram sockets in order to solve that race condition. Either have `connect` and `accept` for udp or don't have either. Anyway, trying to eke out any tiny bit of performance as possible without having to resort to kernel bypass, which is why I have interest in using connected endpoints... – Gabe May 19 '21 at 21:50

John Bollinger · Answer 2 · 2021-05-14T23:46:01.890

You wrote,

> The real problem is that udp offers a 'connect' but no 'accept'

but no, the real problem is that UDP is not a connection-oriented protocol. As the POSIX specifications for connect() explain:

> If the initiating socket is not connection-mode, then connect() shall set the socket's peer address, and no connection is made. For SOCK_DGRAM sockets, the peer address identifies where all datagrams are sent on subsequent send() functions, and limits the remote sender for subsequent recv() functions. [...] Note that despite no connection being made, the term "connected" is used to describe a connectionless-mode socket for which a peer address has been set.

UDP, again, is not a connection-mode protocol. In particular, it is a datagram protocol. If you want a pair of sockets that each have the other set as their peer then you must use connect() on both sides. There is no concept of a server socket for datagrams, in the sense of a factory for connected sockets, and even with a peer set, a UDP socket can still communicate with other endpoints.

If you want to emulate a server socket with UDP, then you need to start with a socket listening on a well-known port. Clients will send a message to that port, and expect the server to establish a new, separate socket on a different port for its end of the pseudo-connection. The server would respond from that socket to tell the client which port it is listening on.

You would want to use the contents of these initial messages to confirm each side's intent and to ensure that the server's initial response is correctly paired with the intended "connection request". For example, perhaps the client's initial message is "CONNECT <PER-CONNECTION-UUID>", and the server's initial response is "ACCEPTED <CLIENTS-CONNECTION-UUID>". The keywords in each confirm the intent of the messages, and matching UUIDs (or some other key) allow the client to match the server response with the right connection request.
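
To make the message matching concrete, here is a small sketch of building and checking those two messages. The helper names (`build_connect`, `build_accepted`, `parse_connect`, `reply_matches`) are hypothetical, messages are assumed to be NUL-terminated text, and the ID is treated as an opaque string rather than validated as a UUID.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "<keyword> <id>" into out. Returns the length, or -1 if it does not fit. */
static int build_msg(char *out, size_t cap, const char *keyword, const char *id)
{
    assert(out != NULL && id != NULL);
    int n = snprintf(out, cap, "%s %s", keyword, id);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Client side: the initial request. */
int build_connect(char *out, size_t cap, const char *id)
{
    return build_msg(out, cap, "CONNECT", id);
}

/* Server side: the reply echoing the client's ID. */
int build_accepted(char *out, size_t cap, const char *id)
{
    return build_msg(out, cap, "ACCEPTED", id);
}

/* Server side: returns a pointer to the ID inside msg, or NULL if msg is
 * not a CONNECT request. */
const char *parse_connect(const char *msg)
{
    static const char prefix[] = "CONNECT ";
    size_t plen = sizeof(prefix) - 1;

    return strncmp(msg, prefix, plen) == 0 ? msg + plen : NULL;
}

/* Client side: returns 1 if msg is "ACCEPTED <expected_id>", otherwise 0. */
int reply_matches(const char *msg, const char *expected_id)
{
    static const char prefix[] = "ACCEPTED ";
    size_t plen = sizeof(prefix) - 1;

    return strncmp(msg, prefix, plen) == 0 &&
           strcmp(msg + plen, expected_id) == 0;
}
```

A client that gets a reply for which `reply_matches` returns 0 can simply ignore it and keep waiting (or retry), which also covers stale replies from an earlier attempt.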

You must also be aware that UDP is an unreliable protocol, however, and you must be prepared to accommodate that. UDP datagrams can be dropped or lost, and they can be received in a different order than they were sent. This is part of how it achieves better throughput than TCP. If you need to overcome those characteristics then you could do so by implementing your own higher-level protocol on top of UDP, but at that point you are probably worse off than if you had just used TCP in the first place.

I'm not talking about the protocol. I'm talking about the API. I'm afraid most people answering here are confusing the two. "connect" in terms of the socket API means that when using any of the "send" variants, a remote endpoint doesn't need to be provided. When sending a million packets per second, that small optimization really adds up. Hence the desire to use the "connected" socket API. Please don't confuse the protocol with the API. — Gabe, May 15 '21 at 03:30
@Gabe, there is no system interface that provides for omitting the remote endpoint from calls to *all* `send` variants or to *all* `recv` variants, only to `send()` and `recv()` specifically. In any case, the majority of my prose in this answer is devoted to describing a way to implement a userspace API for what I think is about as close to what you want as is actually feasible. It is important to understand some of the details of the system's protocol implementation in order to understand how such an API can or should work. — John Bollinger, May 15 '21 at 13:23
John - that is incorrect. `send` actually requires a connected socket (as would `write`) since a remote endpoint cannot be provided as part of the interface. `Sendto, sendmsg, sendmmsg` all allow you to specify a remote endpoint, but if the socket is connected, then the remote endpoint should be omitted (e.g. `dest_addr` set to null). Docs quote: `For a connected socket, these fields should be specified as NULL and 0, respectively`. So... all send variants benefit performance-wise from a connected socket... In some cases, a connected socket is actually required (send/write). — Gabe, May 19 '21 at 14:57
What's incorrect, @Gabe? Yes, `send()` does not permit you to specify an endpoint in the first place. It therefore cannot be used with unconnected sockets, but can be used with connected ones. Yes, when the others are used with a connected socket, the remote endpoint should not be specified. None of this is inconsistent with what I said. — John Bollinger, May 19 '21 at 18:54
John - "there is no system interface that provides for omitting the remote endpoint from calls... only to send() and recv() specifically". -- That was incorrect. — Gabe, May 19 '21 at 21:41
