
Why does Python's urljoin return the url (instead of base combined with url) when the url starts with mailto?

This is what happened:

>>> from urllib.parse import urljoin
>>> urljoin('http://www.w3.org/Consortium/mission.html', 'mailto:site-comments@w3.org')
   'mailto:site-comments@w3.org'

but I expected the result to be:

   'http://www.w3.org/Consortium/mailto:site-comments@w3.org'

Since:

>>> urljoin('http://www.w3.org/Consortium/mission.html', 'thing')
    'http://www.w3.org/Consortium/thing'

(Also see: Python: confusions with urljoin)

At first I thought the mailto URL was returned unchanged because it is an absolute URL. But it doesn't start with // or scheme://, so by the definition in the documentation it isn't an absolute URL.

Note: If url is an absolute URL (that is, starting with // or scheme://), the url's host name and/or scheme will be present in the result.

See: https://docs.python.org/3.0/library/urllib.parse.html

So, if 'mailto:' isn't an absolute URL, why is 'mailto:' the resulting url? It is the behavior I want, but I just don't understand why it happens.

Hibisceae
  • `urljoin('http://www.w3.org/Consortium/mission.html', './mailto:site-comments@w3.org')` will give you: `'http://www.w3.org/Consortium/mailto:site-comments@w3.org'` – floydya Aug 27 '18 at 11:21
  • @floydya Thanks, I know, but why doesn't urljoin('http://www.w3.org/Consortium/mission.html', 'mailto:site-comments@w3.org') result in 'http://www.w3.org/Consortium/mailto:site-comments@w3.org'? – Hibisceae Aug 27 '18 at 11:26
  • Most likely the current behaviour is intended to safeguard against joining non-hierarchical schemes in which case you would get nonsensical values. – floydya Aug 27 '18 at 11:29
  • That makes sense! So: when the url has a different scheme than the base, urljoin returns the url rather than combining it with the base. http and mailto are different schemes, so the result is the url, in this case 'mailto:site-comments@w3.org' – Hibisceae Aug 27 '18 at 11:52
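
The conclusion reached in the comments can be checked with a minimal sketch. It assumes that the scheme reported by `urlsplit` is what decides whether `urljoin` treats the second argument as relative. The names `BASE`, `scheme_of` and `join_report` are hypothetical helpers introduced only for this illustration; they are not part of `urllib.parse`.

```python
from urllib.parse import urljoin, urlsplit

BASE = 'http://www.w3.org/Consortium/mission.html'


def scheme_of(url):
    """Return the scheme urlsplit detects in url, or '' if it finds none."""
    return urlsplit(url).scheme


def join_report(base, url):
    """Hypothetical helper: pair the detected scheme of url with urljoin's result."""
    return scheme_of(url), urljoin(base, url)


# 'mailto:...' carries its own scheme, so it comes back unchanged.
# 'thing' and './mailto:...' have no scheme, so they are resolved against BASE.
for candidate in ('mailto:site-comments@w3.org',
                  'thing',
                  './mailto:site-comments@w3.org'):
    print(candidate, '->', join_report(BASE, candidate))
```

In this sketch only the first candidate reports a non-empty scheme (`mailto`), which matches the case where the url is returned as-is. The `./` prefix stops the text before the colon from being read as a scheme, which would explain why that variant is joined onto the base path.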

0 Answers