
There currently seems to be a lot of confusion, caused both by the pro-privacy features introduced in recent RFCs that aim to hide peers' IPs, and by the logic responsible for optimizing away redundant information produced through the WebRTC client APIs.

So the question is simple: should a client which is NOT behind a NAT produce a local SRFLX candidate when communicating with a STUN server? Or would it be optimized away, since the same public IP is already present within a HOST candidate?

Note that I do get the public IP within a HOST candidate, which confirms that the computer in question is not behind a NAT.

The closest thread I found matching this situation is: WebRTC does not work in the modern browsers when a peer is not behind a NAT because of the obfuscating host address.

There it is considered a Chromium bug which was supposedly fixed, yet I am facing a situation with the most recent build where SRFLX candidates are not being generated even though there is no obstruction at the UDP layer. Once behind a NAT, I do receive a SRFLX candidate.

I am testing with the Trickle ICE test page, running Chrome 112.0.5615.50 on Windows 11.
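For context, this is roughly the kind of minimal repro I have in mind (a sketch of my own, not the code of either test page): a bare RTCPeerConnection pointed at the Google STUN server, logging the type of every gathered candidate.

// Minimal sketch: gather ICE candidates against the Google STUN server
// and log each candidate's type (host / srflx / prflx / relay).
const pc = new RTCPeerConnection({
  iceServers: [{urls: 'stun:stun.l.google.com:19302'}]
});

pc.onicecandidate = ({candidate}) => {
  if (!candidate) {
    console.log('candidate gathering finished');
    return;
  }
  // candidate.address may be an mDNS name such as xxxxxxxx.local
  // when Chrome obfuscates host candidates.
  console.log(candidate.type, candidate.address, candidate.port);
};

// Force an m=audio section so gathering starts without adding any tracks.
pc.createOffer({offerToReceiveAudio: true})
  .then(offer => pc.setLocalDescription(offer))
  .catch(console.error);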

I decided to dive deep into this problem and I am facing a very peculiar situation.

There are two popular test pages in the wild: 1, the Trickle ICE test page from the WebRTC samples (webrtc.github.io), and 2, icetest.info.

Now, to my surprise, I am not getting consistent results across the two test pages.

Page 2 produces local SRFLX candidates right away, while page 1 produces no SRFLX candidates but lots of HOST candidates instead.

Now the question is: how come?

Both pages are to be considered state of the art. They are running in separate tabs, on the same computer, in the same browser, which is broadly open to the Internet.

So I fired up Chromium's WebRTC internals (chrome://webrtc-internals).

For the life of me, I cannot see any differences in the API invocations as they are reported.

All I can see is that 2 is faced with the following onicecandidateerror almost right away:

url: stun:stun.l.google.com:19302
address: [0:0:0:x:x:x:x:x]
port: 57752
host_candidate: [0:0:0:x:x:x:x:x]:57752
error_text: STUN host lookup received error.
error_code: 701

Notice that the error does not prevent Chromium's client-side ICE engine from generating a SRFLX candidate.

Meanwhile, 1 does not face any errors for more than 10 seconds, after which it is faced with:

url: stun:stun.l.google.com:19302
address: 169.254.50.x
port: 52966
host_candidate: 169.254.50.x:52966
error_text: STUN binding request timed out.
error_code: 701

I include dumps of the data exchange for both 1 and 2. I cannot find any discrepancies in the API invocations between the two. Still looking...

Here are the events for 2: (screenshot)

which is followed by the ICE candidates and the error mentioned above: (screenshot)

Now for 1 (the Trickle ICE test page):

(screenshots: the events for 1, followed by its ICE candidates and the timeout error)

Kindly note the discrepancies: two entirely distinct types of errors (codes above 700, meaning client-side), the different timing at which they occur (same browser, same computer, same Google STUN server, same time), and the magnitude of the possible connectivity implications.
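For reference, the dumps above correspond to the fields exposed by the standard RTCPeerConnectionIceErrorEvent; this is a minimal sketch (assuming an already-created RTCPeerConnection named pc) of how to log them yourself:

// Sketch: log the same fields that webrtc-internals shows for onicecandidateerror.
pc.onicecandidateerror = (e) => {
  console.log({
    url: e.url,             // e.g. stun:stun.l.google.com:19302
    address: e.address,     // local address the request was sent from
    port: e.port,
    errorCode: e.errorCode, // codes >= 700 are client-side / transport errors
    errorText: e.errorText, // e.g. "STUN binding request timed out."
  });
};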

Update: I have triple-checked all the lines reported by Chromium's WebRTC internals pane. I could not spot ANY discrepancies in terms of the invoked APIs and/or the parameters passed to them. After all, it is all about creating a local offer and setting the local description. Yet I still cannot spot any differences in the parameters passed into these functions (including the internally reported callbacks): both create-offer calls only ask for audio, and that is it. No exotic explicit parameters are set, just the defaults.

The createOfferOnSuccess() invocations seem to contain IDENTICAL data (except for sequence numbers and ports). The same holds for the data passed into setLocalDescription(). Even the transceiver gets modified in exactly the same way, on both 1 and 2.

And yet the resulting errors and ICE candidates differ by a lot. Now I am extremely curious why that might be.

Why would 2 be immediately faced with "STUN host lookup received error." even though the URL is perfectly fine (as reported by WebRTC internals in the screenshots above) and resolves fine in the other tab? And why does it still manage to produce SRFLX ICE candidates, while the other tab keeps generating HOST candidates containing public IPs and never comes up with the actual SRFLX candidate which the remote peer surely expects?

I then went on to look into the details of, and the differences between, the generated ICE candidates.

For 1, there is no local SRFLX candidate, BUT there is still the following HOST candidate carrying the public IP: (screenshot)

So it turns out the STUN server has no trouble communicating with the local node over UDP. Fingers crossed. Then why no SRFLX? And what is the point of having the SRFLX candidate type at all, if the remote peer can cope with a HOST candidate alone? Can it?

Let us now take a look at the SRFLX candidate produced by 2: (screenshot)

The same public IP address is detected, with a similar port.

I've also checked the initial configuration passed to the RTCPeerConnection constructor:

{ iceServers: [stun:stun.l.google.com:19302], iceTransportPolicy: all, bundlePolicy: balanced, rtcpMuxPolicy: require, iceCandidatePoolSize: 0 }

Same in both cases.
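One additional sanity check I find useful (a sketch, assuming the page's peer connection is reachable as pc from the DevTools console) is to dump the effective configuration the browser actually ended up with in each tab and diff the output:

// Sketch: run in each tab's console and compare the output between tabs.
console.log(JSON.stringify(pc.getConfiguration(), null, 2));
// Both tabs report the same values as webrtc-internals does:
// a single stun:stun.l.google.com:19302 server, iceTransportPolicy "all",
// bundlePolicy "balanced", rtcpMuxPolicy "require", iceCandidatePoolSize 0.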

I've gone to the lengths of checking whether my system assigns different Integrity Labels (an MS security construct) to these tabs. Nope, both are marked as untrusted.

That's the code behind 1:

 async function start() {
  // Clean out the table.
  while (candidateTBody.firstChild) {
    candidateTBody.removeChild(candidateTBody.firstChild);
  }

  gatherButton.disabled = true;
  if (getUserMediaInput.checked) {
    stream = await navigator.mediaDevices.getUserMedia({audio: true});
  }
  getUserMediaInput.disabled = true;

  // Read the values from the input boxes.
  const iceServers = [];
  for (let i = 0; i < servers.length; ++i) {
    iceServers.push(JSON.parse(servers[i].value));
  }
  const transports = document.getElementsByName('transports');
  let iceTransports;
  for (let i = 0; i < transports.length; ++i) {
    if (transports[i].checked) {
      iceTransports = transports[i].value;
      break;
    }
  }

  // Create a PeerConnection with no streams, but force a m=audio line.
  const config = {
    iceServers: iceServers,
    iceTransportPolicy: iceTransports,
  };

  const offerOptions = {offerToReceiveAudio: 1};
  // Whether we gather IPv6 candidates.
  // Whether we only gather a single set of candidates for RTP and RTCP.

  console.log(`Creating new PeerConnection with config=${JSON.stringify(config)}`);
  const errDiv = document.getElementById('error');
  errDiv.innerText = '';
  let desc;
  try {
    pc = new RTCPeerConnection(config);
    pc.onicecandidate = iceCallback;
    pc.onicegatheringstatechange = gatheringStateChange;
    pc.onicecandidateerror = iceCandidateError;
    if (stream) {
      stream.getTracks().forEach(track => pc.addTrack(track, stream));
    }
    desc = await pc.createOffer(offerOptions);
  } catch (err) {
    errDiv.innerText = `Error creating offer: ${err}`;
    gatherButton.disabled = false;
    return;
  }
  begin = window.performance.now();
  candidates = [];
  pc.setLocalDescription(desc);
}

And this is the code behind 2:

start() {
    this.startTime = Date.now()
    this.pc = new RTCPeerConnection({iceServers: this.iceServers})

    this.pc.onicegatheringstatechange = (e) => {
      if (this.pc.iceGatheringState === 'complete') {
        this.endTime = Date.now()
      }
      this.emit('icegatheringstatechange', this.pc.iceGatheringState)
    }

    this.pc.onicecandidate = (event) => {
      if (event.candidate) {
        this.candidates.push({time: Date.now(), candidate: event.candidate})
        this.emit('icecandidate', event.candidate)
      }
    }

    this.pc.createOffer({offerToReceiveAudio: true}).then((desc) => {
      return this.pc.setLocalDescription(desc)
    }).catch((error) => {
      console.error(error)
      reject(error)
    })
  }


  stop() {
    this.pc.onicegatheringstatechange = null
    this.pc.onicecandidate = null
    this.pc.close()
    this.pc = undefined
    this.candidates = undefined
  }

So all in all this boils down to a couple of things:

  • above all - what causes the discrepancies?
  • which tab is in a better situation? In the end, had we been passing the generated ICE candidates over a signaling channel to the other peer (which does not exist in our case), application 1 would be sending lots of host candidates to the other peer, while application 2 would be dispatching the SRFLX ICE candidate. Which one is in a better position to establish a successful connection? Personally, I thought a SRFLX candidate is required to consider a STUN service operational to begin with, and that HOST candidates are only good for same-network communication. That is the assumption we rely on.
  • anyhow, our own code base ends up in the very same spot the Trickle ICE test page does. We follow the Perfect Negotiation paradigm, in accordance with the latest RFCs and/or suggestions, and we do not get a local SRFLX candidate in the aforementioned networking configuration when interacting with any Google STUN server.
  • why does 2 get a local SRFLX candidate? And why is it then faced with what looks like a DNS-related error right at the start?
  • if the local WebRTC subsystem can generate HOST candidates with the public IP address of a node, then what is the point of having a dedicated SRFLX ICE candidate? (A small tallying sketch follows this list.)
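Here is the small tallying sketch mentioned above (my own helper, not part of either test page): it counts the gathered candidates by type and flags host candidates that carry an mDNS .local name rather than a literal IP, which is the distinction these questions revolve around.

// Sketch: tally gathered candidates by type and flag mDNS-obfuscated host candidates.
// Call this before createOffer()/setLocalDescription() so no candidates are missed.
function tallyCandidates(pc) {
  const tally = {host: [], srflx: [], prflx: [], relay: []};
  pc.onicecandidate = ({candidate}) => {
    if (!candidate) {
      // End of gathering: print a per-type summary.
      console.table(Object.entries(tally).map(
          ([type, list]) => ({type, count: list.length})));
      return;
    }
    const entry = {
      address: candidate.address,
      port: candidate.port,
      protocol: candidate.protocol,
      // Host candidates ending in ".local" are mDNS-obfuscated.
      mdns: candidate.type === 'host' && /\.local$/.test(candidate.address || ''),
    };
    tally[candidate.type].push(entry);
    console.log(candidate.type, entry);
  };
}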

UPDATE: Well folks, I have spent a considerable amount of time digging deep into this. I went to the lengths of replicating, or live-altering, the code executed in the scope of 1 to mimic the code executed in the scope of domain 2 and vice versa (live-altering the code in the Chrome DevTools console). Guess what? The results were still not consistent. First I noticed that 2 was not passing credentials when querying the STUN service and thought to myself: BINGO! (Why would one require credentials just to query for an IP? Besides, I recall there used to be a bug around this years ago.) But I was wrong. No luck. I deliberately deleted

delete iceServers[0].username;
delete iceServers[0].credential;

these parameters from the final configuration passed into RTCPeerConnection. And still, one implementation keeps receiving SRFLX candidates while the other is stuck with HOST candidates only.

Seeing this, I began entertaining 'crazy' ideas such as Chrome treating different domains differently, because, come on, what are the other options? There are just a few lines of code involved!

Here it is on YouTube: https://youtu.be/6NHiFVhiQtk

Vega4
  • the reason you are seeing differences might be that you have existing getUserMedia permissions for the samples (some of which access the camera) while the icetest page does not ask for the camera. This changes the behavior of the browser's ICE stack. – Philipp Hancke Apr 07 '23 at 07:31
  • @PhilippHancke and you were RIGHT - this had to do with the permissions assigned to a particular tab. Notice that icetest also has the option to ask for permissions. Now, in my case, I had mic/cam permissions for icetest already granted, which is when I would keep getting lots of HOST ICE candidates and no SRFLX candidate. Once I revoked these permissions, I got the very same results as on the other page (SRFLX + a single HOST candidate). Now I would be extremely interested to hear whether this is expected, and if it is, then why. Different kinds of (DNS) errors... it doesn't feel right. – Vega4 Apr 07 '23 at 07:50
  • Foremost, I am interested to know whether this would affect the handshake and the formation of data streams (the lack of SRFLX being passed to the other peer). Why am I so interested? Because we form peer-to-peer connections based on dummy audio/video streams. Initially we do not request mic/cam permissions until the user deliberately decides to use the corresponding media. In the face of a lacking SRFLX, would the connection formation fall back to a TURN server? Would HOST candidates containing public IPs be enough? Or would the connection formation fail entirely? – Vega4 Apr 07 '23 at 08:10
  • Now notice: with permissions granted there is no SRFLX candidate to begin with. There are HOST candidates instead, some of which carry public IPs (over UDP). Is THAT expected? Wasn't there supposed to be a SRFLX candidate in all cases? – Vega4 Apr 07 '23 at 08:10
  • Now... there seems to be more to it. On our own test page, https://test.gridnet.org, we do not get SRFLX candidates whether we retract camera permissions or not. The code stays the same. – Vega4 Apr 07 '23 at 08:33
  • @Vega4 I watched https://youtu.be/6NHiFVhiQtk. According to https://datatracker.ietf.org/doc/html/draft-ietf-mmusic-mdns-ice-candidates-00#section-3.1.2.1, srflx candidate is not considered redundant if the host address is obfuscated. – Gorisanson Apr 07 '23 at 09:45
  • @Vega4 In the test on https://icetest.info/, it is so. But in the test on https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/, the public host address is *not* obfuscated, so the srflx candidate is considered redundant and discarded. So I think the behaviors in the video (youtu.be/6NHiFVhiQtk) do not violate the WebRTC specification. – Gorisanson Apr 07 '23 at 09:51
  • @Gorisanson would you be so kind as to elaborate a bit more on what is considered 'address obfuscation'? We have already noticed that, as advised by PhilippHancke, revoking privileges for a given tab changes the behaviour, all other things held constant. – Vega4 Apr 07 '23 at 09:54
  • @Vega4 Now I wonder why the test on webrtc.github.io/samples/src/content/peerconnection/trickle-ice does not obfuscate the host address and shows multiple (three) addresses, while the other test on https://icetest.info/ obfuscates the host address and shows only one host address. Do you know the reason? – Gorisanson Apr 07 '23 at 09:55
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/253023/discussion-between-gorisanson-and-vega4). – Gorisanson Apr 07 '23 at 09:58
  • @Vega4 Oh, maybe this is the reason why the host address is not obfuscated in the first test: https://groups.google.com/g/discuss-webrtc/c/6stQXi72BEU. Qingsi Wang says on the first post "When the feature is active, private IP addresses in ICE host candidates will be replaced by an mDNS hostname, e.g., 1f4712db-ea17-4bcf-a596-105139dfd8bf.local. Currently, this feature is active for all sites except those that have getUserMedia permissions, which are presumed to have a higher degree of user trust." – Gorisanson Apr 07 '23 at 10:15

2 Answers


I would say "no" since it does not provide additional information. The spec guidance for this is a bit vague though, https://www.rfc-editor.org/rfc/rfc8445#section-5.1.1.2 only talks about relay candidates ("If a relayed candidate is identical to a host candidate...")

Philipp Hancke
  • I've updated my inquiry. Notice that I'm getting inconsistent results with two popular test beds on the same computer in the same browser, and I cannot see differences in terms of the API invocations reported by WebRTC internals. – Vega4 Apr 06 '23 at 06:27
  • Updated yet again. I'm out of ideas. The ideas I came up with seem terrifyingly out-of-the-box. Would be thankful for anyone else to validate. – Vega4 Apr 07 '23 at 07:25

To answer the general question:

The omission of a server-reflexive candidate is expected when the candidate is redundant with a host candidate.

Reading both the ICE RFC and the mDNS obfuscation draft:

https://www.rfc-editor.org/rfc/rfc8445.html#section-5.1.3

5.1.3. Eliminating Redundant Candidates

Next, the ICE agents (initiating and responding) eliminate redundant candidates. Two candidates can have the same transport address yet different bases, and these would not be considered redundant. Frequently, a server-reflexive candidate and a host candidate will be redundant when the agent is not behind a NAT. A candidate is redundant if and only if its transport address and base equal those of another candidate. The agent SHOULD eliminate the redundant candidate with the lower priority.

https://datatracker.ietf.org/doc/html/draft-ietf-mmusic-mdns-ice-candidates-03#section-3.1.2.2 (expired with no newer draft?)

Regardless of whether the address turns out to be public or private, a server-reflexive candidate will be generated; the transport address of this candidate will be an IP address and therefore distinct from the hostname transport address of the associated mDNS candidate, and as such MUST NOT be considered redundant per the guidance in [RFC8445], Section 5.1.3. To avoid accidental IP address disclosure, this server-reflexive candidate MUST have its raddr field set to "0.0.0.0"/"::" and its rport field set to "9", as discussed in [ICESDP], Section 9.1.

If, given a server reflexive candidate and a host candidate:

  1. Both candidates have the same transport address
  2. Both candidates have the same base
  3. The host candidate is not obfuscated

Then, the two candidates are considered mutually redundant, and the candidate between the two with a lower priority should be eliminated. The RFC recommends calculating priority with the following formula:

priority = (2^24)*(type preference) + (2^8)*(local preference) + (2^0)*(256 - component ID)

Further noting:

The type preference MUST be an integer from 0 (lowest preference) to 126 (highest preference) inclusive, MUST be identical for all candidates of the same type, and MUST be different for candidates of different types. The type preference for peer-reflexive candidates MUST be higher than that of server-reflexive candidates.

The RECOMMENDED values for type preferences are 126 for host candidates, 110 for peer-reflexive candidates, 100 for server-reflexive candidates, and 0 for relayed candidates.
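To make the arithmetic concrete, here is a minimal sketch using the RECOMMENDED type preferences quoted above; the local preference of 65535 (a single-homed host) and component ID 1 (RTP) are assumptions chosen purely for illustration:

// Sketch: RFC 8445 priority formula with the RECOMMENDED type preferences.
const TYPE_PREFERENCE = {host: 126, prflx: 110, srflx: 100, relay: 0};

function candidatePriority(type, localPreference = 65535, componentId = 1) {
  return (2 ** 24) * TYPE_PREFERENCE[type] +
         (2 ** 8) * localPreference +
         (256 - componentId);
}

console.log(candidatePriority('host'));   // 2130706431
console.log(candidatePriority('srflx'));  // 1694498815
// Same transport address + same base + host address not obfuscated
// => the pair is redundant and the lower-priority srflx candidate is dropped.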

In practice, this means that redundancy between a host candidate and a server-reflexive candidate would normally be expected to be resolved in favour of the host candidate. In my experience, this means I would expect to see the STUN packets show up on the wire, but then the browser would neither report the discarded candidate in its ICE summaries nor report it to the remote peer.

shroudednight