What is the preferred solution for checking if a URL is relative or absolute?
3 Answers
71
Python 2
You can use the urlparse module to parse a URL, then decide whether it is relative or absolute by checking whether the host name (netloc) is set.
>>> import urlparse
>>> def is_absolute(url):
...     return bool(urlparse.urlparse(url).netloc)
...
>>> is_absolute('http://www.example.com/some/path')
True
>>> is_absolute('//www.example.com/some/path')
True
>>> is_absolute('/some/path')
False
Python 3
urlparse has been moved to urllib.parse, so use the following:
from urllib.parse import urlparse

def is_absolute(url):
    return bool(urlparse(url).netloc)
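If you also need to tell a full URL apart from a protocol-relative one (the `//host/path` form), a minimal sketch along the same lines could look like this. The name `classify_url` and its three labels are made up for illustration, and the function only looks at scheme and netloc, so a scheme-only URL such as `mailto:someone@example.com` ends up in the "relative" bucket here.

```python
from urllib.parse import urlsplit

def classify_url(url):
    """Return 'absolute', 'protocol-relative', or 'relative' (hypothetical labels)."""
    parts = urlsplit(url)
    if parts.scheme and parts.netloc:
        # Both scheme and host are present, e.g. http://www.example.com/some/path
        return "absolute"
    if parts.netloc:
        # Host without scheme, e.g. //www.example.com/some/path
        return "protocol-relative"
    # No host at all: everything is treated as a path, including www.example.com/some/path
    return "relative"

for sample in ("http://www.example.com/some/path",
               "//www.example.com/some/path",
               "/some/path",
               "www.example.com/some/path"):
    print(sample, "->", classify_url(sample))
```

With this split, `is_absolute` above is equivalent to `classify_url(url) != "relative"`.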

Kukanani

Lukáš Lalinský
- Shouldn't `www.example.com/some/path` count as absolute too? – Geo Dec 02 '11 at 14:21
- Officially, that's a relative URL with the whole string as the path. If you want it to count as absolute, you would have to either add the `http://` in some pre-processing step or not use `urlparse`. – Lukáš Lalinský Dec 02 '11 at 14:26
- According to the RFC, `//google.com` is a protocol-relative URL. And your code will return `False` for it. – Nik Feb 13 '14 at 15:07
- I'd prefer `urlsplit` instead of `urlparse`. BTW, in Django you have a Python 2 & 3 compatible way: `from django.utils.six.moves.urllib.parse import urlsplit, urlparse` – Rockallite Aug 21 '17 at 07:19
- If you want Python 2 & 3 compatibility, just use the six module (`six.moves.urllib.parse`) -> https://pythonhosted.org/six/#module-six.moves.urllib.parse – mateuszb Sep 24 '17 at 10:23
- @Nik not for me: `urlparse('//google.com')` returns `ParseResult(scheme='', netloc='google.com', path='', params='', query='', fragment='')` – Sean Aug 06 '21 at 11:41
29
If you want to know whether a URL is absolute or relative in order to join it with a base URL, I usually just call urllib.parse.urljoin anyway:
>>> from urllib.parse import urljoin
>>> urljoin('http://example.com/', 'http://example.com/picture.png')
'http://example.com/picture.png'
>>> urljoin('http://example1.com/', '/picture.png')
'http://example1.com/picture.png'
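To see how urljoin treats the different kinds of reference, here is a small sketch. The helper name `resolve` and the `kept_base_host` flag are assumptions added for illustration; only `urljoin` and `urlsplit` come from the standard library.

```python
from urllib.parse import urljoin, urlsplit

def resolve(base, ref):
    """Resolve ref against base; also report whether the result still points at base's host."""
    resolved = urljoin(base, ref)
    kept_base_host = urlsplit(resolved).netloc == urlsplit(base).netloc
    return resolved, kept_base_host

# A path-only reference is attached to the base host.
print(resolve("http://example.com/", "/picture.png"))
# A protocol-relative reference keeps the base scheme but switches host.
print(resolve("http://example.com/", "//cdn.example.org/x.png"))
# A bare host name has no scheme or netloc, so it is treated as a relative path.
print(resolve("http://www.yahoo.com", "www.google.com"))
```

The last call is the case to watch for: a host name written without `http://` or `//` is joined as a path segment, not treated as a new host.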

Bob Whitelock

warvariuc
- It turns out that this is what I wanted to do: it treats the first URL as the default for all unspecified parts of the second URL. If the second one is absolute, it just uses that one. – rescdsk Oct 21 '13 at 15:22
- Anyone using this should be aware that if given `http://www.yahoo.com` and `www.google.com` as inputs, this will give you `http://www.yahoo.com/www.google.com` as output, which probably isn't what you wanted. So you'll still have to check somehow whether the second one is a URL without a scheme or actually a relative path. – J. Taylor Feb 02 '19 at 06:36
2
I can't comment on the accepted answer, so I'm writing this as a new answer. IMO, checking the scheme as in the accepted answer (bool(urlparse.urlparse(url).scheme)) is not a good idea, because http://example.com/file.jpg, https://example.com/file.jpg and //example.com/file.jpg are all absolute URLs, but in the last case we get scheme = ''.
I use this code:
is_absolute = '//' in my_url
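One thing to keep in mind with a plain substring test is that `//` can also appear inside a path or query string. The sketch below compares it with a netloc check; the function names `has_double_slash` and `has_netloc` and the sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit

def has_double_slash(url):
    """Substring test: true whenever '//' occurs anywhere in the string."""
    return "//" in url

def has_netloc(url):
    """Parser-based test: true only when the URL carries a host part."""
    return bool(urlsplit(url).netloc)

for sample in ("https://example.com/file.jpg",
               "//example.com/file.jpg",
               "/login?next=//other.example/",
               "images//file.jpg"):
    print(sample, has_double_slash(sample), has_netloc(sample))
```

Both tests agree on the first two samples, but the last two contain `//` without naming a host, so only the substring test reports them as absolute.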

Alexander Ovchinnikov
- AFAIK `//foo/bar` is a valid relative URL, with "relative" meaning "without scheme and netloc". – guettli Dec 18 '17 at 10:39