I have a dedicated server build of an in-development realtime multiplayer game that I'd like to run on a Kubernetes cluster. The game targets browsers, so it uses WebRTC RTCDataChannels for network communication. When running the game server on a Kubernetes node, players can connect directly to it if I configure my Pods with `hostNetwork: true`. Without `hostNetwork: true`, players can still connect directly, but only if they're behind well-behaved NATs; if they're not, the only option for a successful connection is to introduce a TURN server to relay the traffic.

The reason appears to be that the game server runs on a Kubernetes node behind what is essentially a symmetric NAT, i.e. the NAT's mapping behaviour is address and port dependent. I've confirmed this by firing two STUN messages at different STUN servers from the same UDP socket within the container: the binding responses have the same public IP but different ports.
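A minimal sketch of that check, assuming IPv4 and STUN servers that answer with an XOR-MAPPED-ADDRESS attribute. The function names (`build_binding_request`, `parse_xor_mapped_address`, `mapped_addresses`) are made up for illustration, and retransmission and stray-packet handling are left out:

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442       # fixed value defined by the STUN RFC
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding request with no attributes."""
    tid = transaction_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + tid, tid


def parse_xor_mapped_address(packet, transaction_id):
    """Extract (ip, port) from the XOR-MAPPED-ADDRESS attribute (IPv4 only)."""
    msg_type, msg_len, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    if packet[8:20] != transaction_id:
        raise ValueError("transaction ID mismatch")
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        value = packet[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attribute values are padded to a 4-byte boundary.
        offset += 4 + ((attr_len + 3) & ~3)
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


def mapped_addresses(servers, timeout=2.0):
    """Query every STUN server from ONE UDP socket; return {server: (ip, port)}."""
    results = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("0.0.0.0", 0))
        for server in servers:
            request, tid = build_binding_request()
            sock.sendto(request, server)
            response, _ = sock.recvfrom(2048)
            results[server] = parse_xor_mapped_address(response, tid)
    return results
```

Calling `mapped_addresses([("stun.l.google.com", 19302), ("stun1.l.google.com", 19302)])` from inside the container (the two servers are only an example choice) and getting the same IP with two different ports would match the address and port dependent mapping described above. The same port from both servers would suggest an endpoint-independent mapping instead.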

This NAT behaviour reduces the success rate of players connecting directly to the game server when no TURN server is available to relay traffic. I'd very much like to avoid relaying traffic through a TURN server where possible because of the added latency, which is undesirable for a realtime multiplayer game.

I'd have liked to just set `hostNetwork: true` and be done with it, but I'm considering using Agones, which doesn't support it (as it takes away the ability to securely do sidecars).

So I'm wondering whether I have any other options (ideally without introducing yet another server to relay traffic through) to tweak this NAT behaviour, as I don't believe it's feasible to forward the full range of random UDP ports that WebRTC is going to try to use for communication.


Update: After reading through the TURN RFC, I now think the setup diagrammed below may be possible using a TURN server running in the same container as the dedicated game server (so the added latency should be minimal).

[Diagram: Game Server + TURN Server + Kubernetes]

The game clients will act as the TURN clients, i.e. they create the allocation on the TURN server, etc.

The Game Server will act as the peer with which the clients want to communicate. The Game Server itself wouldn't need to be a TURN client and, if my understanding is correct, wouldn't even need to know that the TURN server is a TURN server.

dbotha
  • Also see https://stackoverflow.com/questions/64232853/how-to-use-webrtc-with-rtcpeerconnection-on-kubernetes – Jonas Jul 13 '21 at 08:12

3 Answers

Indeed, a TURN server can be deployed in the backend so that only a single UDP and/or TCP port is exposed to WebRTC clients. It's not the classic usage, but it's something I've seen in production a couple of times. Only the client needs to be set up with the TURN server; the game server will communicate directly with the TURN server, as it will relay on the local network address. Additionally, you should force ICE relay on the client.
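As a rough illustration of forcing relay, here is a sketch of a signaling backend building the ICE configuration it hands to browser clients. The field names follow the W3C `RTCConfiguration` dictionary, where `iceTransportPolicy: "relay"` limits the browser to relay candidates. The helper name, host, port and credentials are placeholders, not values from this setup:

```python
import json


def relay_only_ice_config(turn_host, turn_port, username, credential):
    """Build an RTCConfiguration-shaped dict that only permits relay candidates."""
    return {
        "iceServers": [
            {
                "urls": [
                    f"turn:{turn_host}:{turn_port}?transport=udp",
                    f"turn:{turn_host}:{turn_port}?transport=tcp",
                ],
                "username": username,
                "credential": credential,
            }
        ],
        # "relay" tells the browser to use relay (TURN) candidates only,
        # so host and server-reflexive candidates are never tried.
        "iceTransportPolicy": "relay",
    }


# Placeholder values; the JSON would be sent to the client during signaling.
config_json = json.dumps(
    relay_only_ice_config("turn.example.com", 3478, "player", "secret")
)
```

On the browser side the parsed object can be passed straight to `new RTCPeerConnection(config)`.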

Another option could be to set a port range in the WebRTC agent configuration and forward the port range on the NAT. In that case, instead of setting a STUN server, you can manually override emitted host candidates with the external address to optimize connection establishment.

  • Thanks Paul-Louis, this solution works for me. That said, for anyone else solving the same problem: first consider whether your WebRTC library supports listening on a single UDP/TCP port. If so, I suggest going with [Sean's](https://stackoverflow.com/a/68349888/406920) answer instead, as you won't be introducing another point of failure. – dbotha Jul 12 '21 at 21:24

It is possible to serve many WebRTC connections using a single listening UDP/TCP port. You demux using the ICE ufrag and pwd and then route using the remote 3-tuple.

I am not sure what WebRTC implementation you are using, but with Pion you can enable this with `SetICEUDPMux`; the actual code is in pion/ice. If we have seen the remote address before, we demux on it directly; otherwise we try to look up the ICE values.
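The lookup order described above can be sketched roughly as follows. This is a simplified illustration, not Pion's actual code: `UdpMux` and `stun_username` are hypothetical names, and the MESSAGE-INTEGRITY check against the ICE pwd is omitted. It relies on ICE connectivity checks carrying a STUN USERNAME attribute of the form `<receiver ufrag>:<sender ufrag>`:

```python
import struct

MAGIC_COOKIE = 0x2112A442
USERNAME = 0x0006  # STUN USERNAME attribute type


def stun_username(packet):
    """Return the USERNAME attribute of a STUN message, or None if absent."""
    if len(packet) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", packet[:8])
    # STUN messages have their two top bits clear and carry the magic cookie.
    if msg_type & 0xC000 or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(len(packet), 20 + msg_len)
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        if attr_type == USERNAME:
            raw = packet[offset + 4:offset + 4 + attr_len]
            return raw.decode("utf-8", "replace")
        offset += 4 + ((attr_len + 3) & ~3)  # values are padded to 4 bytes
    return None


class UdpMux:
    """Route packets from one shared UDP socket to per-peer connections."""

    def __init__(self):
        self._by_ufrag = {}  # local ICE ufrag -> connection
        self._by_addr = {}   # remote (ip, port) -> connection

    def register(self, local_ufrag, conn):
        self._by_ufrag[local_ufrag] = conn

    def route(self, remote_addr, packet):
        # Known remote address: route directly (the protocol part of the
        # 3-tuple is implied by the single UDP socket).
        conn = self._by_addr.get(remote_addr)
        if conn is not None:
            return conn
        # Unknown address: fall back to the ICE ufrag in the STUN USERNAME.
        username = stun_username(packet)
        if username is None:
            return None
        conn = self._by_ufrag.get(username.split(":", 1)[0])
        if conn is not None:
            self._by_addr[remote_addr] = conn  # remember for later packets
        return conn
```

Once the first connectivity check from a peer has been matched by ufrag, later DTLS and SCTP packets from the same address are routed without needing any STUN attributes.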

I have also seen people go down the TURN route, but I would recommend against it. When you add one more piece that can drop traffic under heavy load, debugging can be frustrating.

Sean DuBois
  • The game server is written in C++ and I'm using [libdatachannel](https://github.com/paullouisageneau/libdatachannel), which unfortunately doesn't currently support this type of demuxing on a single socket; otherwise that's absolutely the route I'd have gone. I've actually got something working now which involves running a TURN server in a second container in the Kubernetes Pod (I'll follow up with an answer). As it's internal to the same Pod, the additional latency should be minimal. I appreciate your point about it being another point of failure though... – dbotha Jul 12 '21 at 16:38
  • I think you could still do this demuxing! You could run a 'proxy' on the same host on a dedicated port, and then use that to redirect your traffic. It would end up just being a slightly lighter TURN server though, probably not worth the maintenance effort :/ but the diagram you shared re: TURN will work, and that is exactly what others are doing! – Sean DuBois Jul 12 '21 at 18:05
  • Author of libdatachannel here. Indeed, serving multiple WebRTC peer connections on a single UDP socket is not supported for now, as it would need to be implemented in one of the ICE libraries (libjuice, I guess). It would be a handy feature, but note that it does not follow the ICE agent specification and would need to be restricted to server usage (it kind of breaks TURN allocations and prevents opening multiple connections between single-port agents). libdatachannel allows specifying a port range for WebRTC, which could also help here, but I confirm a TURN server is still a good option. – Paul-Louis Ageneau Jul 12 '21 at 18:59
  • @SeanDuBois This was exactly the path I was originally going down, I was discussing it with Paul-Louis [here](https://github.com/paullouisageneau/libdatachannel/discussions/451#discussion-3455343). I started reading the TURN RFC in preparation then figured I'd try a TURN server in the same Pod - to my pleasant surprise it worked. Yes the maintenance effort vs this puts me off now but seemed like it would have been a neat little solution. I'm accepting Paul-Louis' answer over yours only because it's the only solution in my case as I can't listen on a single UDP/TCP port. Thanks for your help! – dbotha Jul 12 '21 at 21:19
  • @dbotha For information, libdatachannel now also supports multiplexing multiple WebRTC peer connections on a single UDP socket with the `enableIceUdpMux` option. – Paul-Louis Ageneau Mar 10 '22 at 10:10

For anyone still looking for a solution to this problem: STUNner is a new WebRTC media gateway that is designed precisely to support the use case the OP seeks, that is, ingesting WebRTC media traffic into a Kubernetes cluster. Note that STUNner itself is a TURN server but, being deployed into the same Kubernetes cluster as the game servers, it will add no visible delay to the game server traffic.

Disclaimer: I'm one of the authors of STUNner.

Gabor Retvari