How can I check if a URL is absolute using Python?

Question

What is the preferred way to check whether a URL is relative or absolute?

score 71 · Accepted Answer · edited Mar 02 '17 at 21:34

Python 2

You can use the urlparse module to parse a URL, then decide whether it is relative or absolute by checking whether the host name (netloc) is set.

>>> import urlparse
>>> def is_absolute(url):
...     return bool(urlparse.urlparse(url).netloc)
... 
>>> is_absolute('http://www.example.com/some/path')
True
>>> is_absolute('//www.example.com/some/path')
True
>>> is_absolute('/some/path')
False

Python 3

urlparse has been moved to urllib.parse, so use the following:

from urllib.parse import urlparse

def is_absolute(url):
    return bool(urlparse(url).netloc)
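
As a rough illustration of how this check behaves, here is a small sketch that runs it over a few sample inputs. `has_scheme_and_host` is a hypothetical stricter variant that is not part of the answer; it assumes you only want to accept URLs that carry both a scheme and a host, which would reject scheme-less `//host/path` URLs.

```python
from urllib.parse import urlparse


def is_absolute(url):
    # Same check as in the answer: absolute means a network location is present.
    return bool(urlparse(url).netloc)


def has_scheme_and_host(url):
    # Hypothetical stricter variant (an assumption, not from the answer):
    # require both a scheme and a network location.
    parsed = urlparse(url)
    return bool(parsed.scheme and parsed.netloc)


samples = [
    'http://www.example.com/some/path',
    '//www.example.com/some/path',
    '/some/path',
    'www.example.com/some/path',  # no scheme and no '//', so parsed as a path
]

for candidate in samples:
    print(candidate, is_absolute(candidate), has_scheme_and_host(candidate))
```

Which variant fits depends on whether scheme-less `//host/path` URLs should count as absolute in your application.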

edited Mar 02 '17 at 21:34

Kukanani


answered Dec 02 '11 at 14:05

Lukáš Lalinský


4

Shouldn't `www.example.com/some/path` count as absolute too? – Geo Dec 02 '11 at 14:21
5

Officially, that's a relative URL with the whole string as the path. If you want it to count as absolute, you would have to either add the `http://` in some pre-processing step or not use `urlparse`. – Lukáš Lalinský Dec 02 '11 at 14:26
4

According to RFC `//google.com` is a protocol-relative url. And your code will return `False` for it. – Nik Feb 13 '14 at 15:07
I'd prefer `urlsplit` instead of `urlparse`. BTW, in Django you have a Python 2 & 3 compatible way: `from django.utils.six.moves.urllib.parse import urlsplit, urlparse` – Rockallite Aug 21 '17 at 07:19
If you want Python 2 & 3 compatibility just use six module (`six.moves.urllib.parse`) -> https://pythonhosted.org/six/#module-six.moves.urllib.parse – mateuszb Sep 24 '17 at 10:23
1

@Nik not for me: In [27]: urlparse('//google.com') Out[27]: ParseResult(scheme='', netloc='google.com', path='', params='', query='', fragment='') – Sean Aug 06 '21 at 11:41
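
The pre-processing suggested in the comments above (adding `http://` to inputs such as `www.example.com/some/path`) could look roughly like the sketch below. `assume_http` and its `default_scheme` parameter are hypothetical names, and the sketch assumes that any input without a scheme, a network location, or a leading slash is meant to start with a host name. That assumption is a guess about intent: a genuinely relative path like `images/logo.png` would be misread as a host.

```python
from urllib.parse import urlparse


def assume_http(url, default_scheme='http'):
    # Hypothetical helper: prepend a scheme to inputs that look like
    # "host/path" so that urlparse() sees the host as the netloc.
    parsed = urlparse(url)
    if not url or parsed.scheme or parsed.netloc or url.startswith('/'):
        # Empty, already has a scheme or host, or is a rooted path:
        # leave the input untouched.
        return url
    return default_scheme + '://' + url


def is_absolute(url):
    return bool(urlparse(url).netloc)


fixed = assume_http('www.example.com/some/path')
print(fixed, is_absolute(fixed))
print(assume_http('/some/path'), is_absolute(assume_http('/some/path')))
```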

score 29 · Answer 2 · edited Mar 16 '21 at 07:43

If you only need to know whether a URL is absolute or relative so that you can join it with a base URL, I usually just call urllib.parse.urljoin, which handles both cases:

>>> from urllib.parse import urljoin
>>> urljoin('http://example.com/', 'http://example.com/picture.png')
'http://example.com/picture.png'
>>> urljoin('http://example1.com/', '/picture.png')
'http://example1.com/picture.png'
>>>
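
To make the joining behaviour a bit more concrete, here is a short sketch with a few more inputs. The base URL and file names are made up for illustration. Note the last case: a bare host name without a scheme or leading `//` is resolved as a relative path, so `urljoin` alone does not tell you what the caller intended.

```python
from urllib.parse import urljoin

base = 'http://example.com/a/b'

# An absolute URL replaces the base entirely.
print(urljoin(base, 'https://other.example/x.png'))  # https://other.example/x.png

# A scheme-less '//host/path' URL keeps only the base's scheme.
print(urljoin(base, '//cdn.example.org/x.png'))      # http://cdn.example.org/x.png

# A rooted path keeps the base's scheme and host.
print(urljoin(base, '/picture.png'))                 # http://example.com/picture.png

# A plain relative path is resolved against the base's directory.
print(urljoin(base, 'c.png'))                        # http://example.com/a/c.png

# A bare host name is treated as a relative path.
print(urljoin('http://www.yahoo.com', 'www.google.com'))  # http://www.yahoo.com/www.google.com
```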

edited Mar 16 '21 at 07:43

Bob Whitelock


answered Dec 02 '11 at 13:45

warvariuc


4

It turns out that this is what I wanted to do - it treats the first URL as the default for all unspecified parts of the second URL. If the second one is absolute, it just uses that one. – rescdsk Oct 21 '13 at 15:22
2

Anyone using this should be aware that, given `http://www.yahoo.com` and `www.google.com` as inputs, this will return `http://www.yahoo.com/www.google.com`, which probably isn't what you wanted. So you still have to check somehow whether the second one is a URL without a scheme or actually a relative path. – J. Taylor Feb 02 '19 at 06:36

score 2 · Answer 3 · answered Jan 23 '14 at 02:40

I can't comment on the accepted answer, so I'm writing this as a new answer. IMO, checking the scheme as in the accepted answer ( bool(urlparse.urlparse(url).scheme) ) is not a good idea: http://example.com/file.jpg, https://example.com/file.jpg and //example.com/file.jpg are all absolute URLs, but in the last case we get scheme = ''.

I use this code:

is_absolute = '//' in my_url
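
For comparison, here is a small sketch that puts this substring test next to a netloc-based check. The function names and sample URLs are made up for illustration. The two agree on the common cases, but the substring test also matches a `//` that appears later in the string, for example inside a query parameter.

```python
from urllib.parse import urlparse


def contains_double_slash(url):
    # The substring heuristic from this answer.
    return '//' in url


def has_netloc(url):
    # Parser-based check: a network location was actually recognised.
    return bool(urlparse(url).netloc)


samples = [
    'http://example.com/file.jpg',
    '//example.com/file.jpg',
    '/file.jpg',
    '/redirect?next=http://example.com/',  # relative, but contains '//'
]

for candidate in samples:
    print(candidate, contains_double_slash(candidate), has_netloc(candidate))
```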

answered Jan 23 '14 at 02:40

Alexander Ovchinnikov


2

AFAIK //foo/bar is a valid relative URL. With "relative" meaning "without scheme and netloc". – guettli Dec 18 '17 at 10:39
