Why does WebRTC require both browsers to generate connection info?

Question

I am looking into building a game using WebRTC, mostly to learn how to use WebRTC. What I envisioned was that one browser (let's call it Alice) wants to start a game. Alice figures out her connection information and sends it to another browser (Bob) that she wants to join the game. I like the idea of a link similar to a Discord invite.

I had imagined that this was all that was required: Bob's browser knows where Alice is, and Alice is expecting a connection from someone who knows her connection information (her SDP). Instead, Bob needs to generate his own connection information (his SDP) and then hand that back to Alice somehow. (For reference, here is an implementation of a "serverless" WebRTC client, which requires both parties to pass their connection info to the other person: https://github.com/lesmana/webrtc-without-signaling-server)

Because there are two required messages, telling users to do this manually is very much a pain, and gets increasingly difficult with more users (e.g. Alice, Bob and Charlie want to connect). For this reason we have "signaling servers" which handle this handshaking.

My question is: why is all of this necessary? Is it for security? Couldn't you consider a browser secure enough if its SDP info included a generated hash that only those it expects (like Bob) have access to?

score 0 · Answer 1 · answered Feb 11 '21 at 15:12

Don't confuse connection info (ICE candidates) with SDP. See "What are ICE Candidates and how do the peer connection choose between them?"

If you are asking specifically about web browsers, then yes, you have to collect connection info from each browser, and this has nothing to do with SDP. Browsers do not listen on a specific, well-known port that is also open in firewalls, so one browser cannot simply connect to another using a well-known endpoint (IP:port). The idea is that a STUN server tells each browser the public address and port it appears to have from outside its NAT. Once both sides know this, they can send packets to each other, which opens a hole in both firewalls and makes a direct connection between the browsers possible. Read the STUN spec to see how this is done.
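To make "connection info" concrete, here is a minimal sketch that splits one ICE candidate line into its fields. The field layout follows the candidate attribute grammar from the ICE SDP spec (RFC 8839); the `IceCandidateInfo` type, the `parseCandidate` helper, and the sample addresses are made up for illustration and are not part of any browser API.

```typescript
// Fields of one ICE candidate line. Type and helper names are hypothetical.
interface IceCandidateInfo {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: string; // "host", "srflx", "prflx" or "relay"
  relatedAddress?: string;
  relatedPort?: number;
}

function parseCandidate(line: string): IceCandidateInfo {
  // Accept both the SDP form ("a=candidate:...") and the bare form.
  const parts = line
    .trim()
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error(`not an ICE candidate line: ${line}`);
  }
  const info: IceCandidateInfo = {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7],
  };
  // Optional attributes follow the type as name/value pairs.
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") info.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") info.relatedPort = Number(parts[i + 1]);
  }
  return info;
}

// Sample server-reflexive candidate: the public address a STUN server reported,
// plus the private address it maps to. Both addresses are invented examples.
const srflx = parseCandidate(
  "candidate:842163049 1 udp 1677729535 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321"
);
console.log(srflx.type, srflx.address, srflx.port);
```

A `srflx` candidate is the kind that only exists after asking a STUN server, which is why neither browser can know the other's usable endpoint in advance.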

However, if one peer is a browser and the other peer is your own application that listens on a specific port (WebRTC gateways, media servers), then you don't need to collect connection info (ICE candidates) from the browser. Nobody needs it. STUN/TURN servers are not involved. The browser always connects to your application. You can hardcode an ICE candidate in your webpage that contains the endpoint exposed by your application.
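A hardcoded candidate for such an application could be built roughly like this. This is a sketch under assumptions: the address `198.51.100.20`, the port `50000`, the `sdpMid` value `"0"`, and the helper names are all hypothetical. `CandidateInit` mirrors the shape of the browser's `RTCIceCandidateInit` dictionary but is declared locally so the snippet runs outside a browser. The priority uses the formula and recommended host preference from the ICE spec (RFC 8445).

```typescript
// Local stand-in for the browser's RTCIceCandidateInit dictionary.
interface CandidateInit {
  candidate: string;
  sdpMid: string;
  sdpMLineIndex: number;
}

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component)
function candidatePriority(typePref: number, localPref: number, component: number): number {
  return typePref * 2 ** 24 + localPref * 2 ** 8 + (256 - component);
}

// Builds a "host" candidate pointing at a fixed, publicly reachable endpoint.
function hostCandidateFor(address: string, port: number): CandidateInit {
  const priority = candidatePriority(126, 65535, 1); // 126 = recommended host preference
  return {
    candidate: `candidate:1 1 udp ${priority} ${address} ${port} typ host`,
    sdpMid: "0", // assumption: the first media section has mid "0"
    sdpMLineIndex: 0,
  };
}

// Hypothetical endpoint exposed by the server application.
const serverCandidate = hostCandidateFor("198.51.100.20", 50000);
console.log(serverCandidate.candidate);
```

In a browser, an object of this shape could be passed to `RTCPeerConnection.addIceCandidate()` once the remote description has been set. It only covers where to connect; the SDP exchange is still needed.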

You always have to exchange SDPs between two peers, because they carry codec information and other details about the media stream that the other peer needs to know. Browsers need to agree that they can decode the incoming stream, for example.
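As an illustration of the codec information an SDP carries, the sketch below reads the `a=rtpmap` lines from two SDP fragments and lists the codec names both sides offer. The browser does this negotiation itself; `listCodecs`, `commonCodecNames`, and the shortened sample SDPs are assumptions made for the example.

```typescript
interface CodecInfo {
  payloadType: number;
  name: string;
  clockRate: number;
}

// Collects every codec declared with "a=rtpmap:<pt> <name>/<clock rate>".
function listCodecs(sdp: string): CodecInfo[] {
  const codecs: CodecInfo[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const m = /^a=rtpmap:(\d+) ([^/]+)\/(\d+)/.exec(line);
    if (m) {
      codecs.push({ payloadType: Number(m[1]), name: m[2], clockRate: Number(m[3]) });
    }
  }
  return codecs;
}

// Codec names from the offer that also appear in the answer (case-insensitive).
function commonCodecNames(offerSdp: string, answerSdp: string): string[] {
  const answerNames = new Set(listCodecs(answerSdp).map((c) => c.name.toLowerCase()));
  return listCodecs(offerSdp)
    .map((c) => c.name)
    .filter((name) => answerNames.has(name.toLowerCase()));
}

// Heavily shortened sample SDPs; real ones contain many more lines.
const offerSdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:102 H264/90000",
].join("\r\n");

const answerSdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

console.log(commonCodecNames(offerSdp, answerSdp));
```

Here the answering side does not list H264, so only the other two codecs remain usable. Neither side could know that without seeing the other's SDP.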

What if I plan to only use a data channel? Could I then hardcode the ports and generate the SDP from the offering web browser? I think in my specific case the _only_ surefire unknown is the IP address of the answering web browser – andykais, Feb 11 '21 at 21:28
A data channel is no different in terms of connectivity: browsers still need to be connected through NAT firewalls. So no, you cannot. – user1390208, Feb 11 '21 at 22:21
There are more unknowns besides the port, in particular the ice-ufrag, ice-pwd and the DTLS fingerprint. All of these relate to security. – Philipp Hancke, Feb 12 '21 at 14:19
Got it. I did a little more research: are NAT firewalls only relevant some of the time? As I understand it, that's what TURN servers are for. They're a way to send your traffic through a middleman to avoid firewall constraints. Do NAT firewalls matter when just a STUN server is needed? – andykais, Feb 12 '21 at 20:17
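
The per-session values named in the comments (ice-ufrag, ice-pwd, DTLS fingerprint) appear as plain attribute lines in each peer's SDP. A minimal sketch of reading them, assuming a hypothetical `extractSessionSecrets` helper and placeholder values that are far shorter than real ones:

```typescript
interface SessionSecrets {
  iceUfrag?: string;
  icePwd?: string;
  fingerprint?: string;
}

// Returns the value after a given "a=<name>:" prefix, or undefined if the line differs.
function valueAfter(line: string, prefix: string): string | undefined {
  return line.startsWith(prefix) ? line.slice(prefix.length) : undefined;
}

// Picks out the ICE credentials and DTLS fingerprint from an SDP.
// If an attribute occurs more than once, the last occurrence wins.
function extractSessionSecrets(sdp: string): SessionSecrets {
  const out: SessionSecrets = {};
  for (const line of sdp.split(/\r?\n/)) {
    out.iceUfrag = valueAfter(line, "a=ice-ufrag:") ?? out.iceUfrag;
    out.icePwd = valueAfter(line, "a=ice-pwd:") ?? out.icePwd;
    out.fingerprint = valueAfter(line, "a=fingerprint:") ?? out.fingerprint;
  }
  return out;
}

// Placeholder values only; a browser generates fresh ones per connection.
const sampleSdp = [
  "v=0",
  "a=ice-ufrag:EsAw",
  "a=ice-pwd:P2uYro0UCOQ4zxjKXaWCBui1",
  "a=fingerprint:sha-256 AB:CD:EF:01:23:45",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
].join("\r\n");

console.log(extractSessionSecrets(sampleSdp));
```

Because each browser generates these values itself, they cannot be hardcoded ahead of time the way a fixed port could.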
