I have a simple question. I am trying to join three parts of a URL using urljoin:

   host = "http://foo.com:port"
   ver = "/v1"
   exten = "/path"

Rather than doing host + ver + exten, I want to use urljoin to generate the URL. But urljoin(host, ver, exten) gives http://foo.com:port/v1.

frazman
  • The [docs](https://docs.python.org/2/library/urlparse.html#urlparse.urljoin) suggest that's not how urljoin works. Maybe try [urlunsplit](https://docs.python.org/2/library/urlparse.html#urlparse.urlunsplit) – Yep_It's_Me Jul 17 '14 at 22:42
  • @Yep_It's_Me: Not sure I understand. Can you give an example? urlunsplit will split the URL, right? – frazman Jul 17 '14 at 22:48
  • Sorry. urlunsplit doesn't do quite what I thought it does. It will only join a tuple of type `urlparse.SplitResult`. My bad. – Yep_It's_Me Jul 17 '14 at 23:00

4 Answers

urljoin works by combining one base URL with one other URL. You could join the relative paths together with simple string concatenation, and then use urljoin to join the host and the combined relative path.

For example:

rel = ver + exten
url = urljoin(host, rel)
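
To see why the three-argument call in the question drops the last part, here is a minimal sketch for Python 3 (urllib.parse). The numeric port 8080 is an assumption standing in for the question's placeholder "port".

```python
from urllib.parse import urljoin

host = "http://foo.com:8080"  # assumed numeric port, for illustration only
ver = "/v1"
exten = "/path"

# The third positional parameter of urljoin is allow_fragments, not another
# path segment, so "/path" is only interpreted as a truthy flag here.
three_args = urljoin(host, ver, exten)

# Concatenate the relative parts first, then resolve them against the base.
combined = urljoin(host, ver + exten)

print(three_args)  # http://foo.com:8080/v1
print(combined)    # http://foo.com:8080/v1/path
```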

Sadly, urljoin accepts only one base and one URL, so to combine multiple URL path segments you need something else. On a non-Windows machine you could use os.path.join, just as you would combine a local file path; posixpath.join does the same thing with forward slashes on every platform.

Philip Massey

Here's one way to do it on Linux (Python 2.x):

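import os
import urlparse  # Python 2; in Python 3 this module is urllib.parse

def url_join(host, version, *additional_path):
    # os.path.join uses the OS separator, so this only yields "/" on POSIX systems.
    return urlparse.urljoin(host, os.path.join(version, *additional_path))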

Then call the function like this:

>>> url_join("http://example.com:port", "v1", "path1", "path2", "path3")
'http://example.com:port/v1/path1/path2/path3'
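
A platform-independent variant could look like the sketch below. It assumes Python 3 and swaps in posixpath.join, which always uses "/" as the separator. The port 8080 is an illustrative stand-in for "port".

```python
import posixpath
from urllib.parse import urljoin


def url_join(host, version, *additional_path):
    # posixpath.join always uses "/" as the separator, on Windows too.
    return urljoin(host, posixpath.join(version, *additional_path))


print(url_join("http://example.com:8080", "v1", "path1", "path2", "path3"))
# http://example.com:8080/v1/path1/path2/path3
```

Note that a segment starting with "/" makes posixpath.join discard everything before it, so the segments here are passed without leading slashes.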
AnukuL
  • os.path.join doesn't use the right separator on Windows machines. It returns `'http://foo.com:port/v1\\path1\\path2\\path3'` – Jakob Mar 22 '17 at 05:56
  • You should use `import posixpath`. When you import `os.path` on any *nix you get `posixpath`, and on any Windows you get `ntpath`. – Matei May 02 '17 at 15:27

You can also join your list of parts recursively:

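import urllib.parse

def urljoin(parts):
    # Fold the first two parts into one, then recurse on the shorter list.
    if len(parts) > 1:
        parts = [urllib.parse.urljoin(parts[0], parts[1])] + parts[2:]
        return urljoin(parts)

    # A single remaining part is the fully joined URL.
    return parts[0]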

Use the function like this:

parts = [
    'https://stackoverflow.com/',
    'questions/24814657/',
    'how-to-do-url-join-in-python-using-multiple-parameters/',
    '41756140#41756140',
]

print(urljoin(parts))
# https://stackoverflow.com/questions/24814657/how-to-do-url-join-in-python-using-multiple-parameters/41756140#41756140
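
The same left-to-right fold can also be sketched with functools.reduce instead of recursion. The name urljoin_all is hypothetical, chosen here to avoid shadowing the standard-library function.

```python
from functools import reduce
from urllib.parse import urljoin


def urljoin_all(parts):
    # reduce applies urljoin(accumulated, next_part) from left to right.
    return reduce(urljoin, parts)


parts = [
    'https://stackoverflow.com/',
    'questions/24814657/',
    'how-to-do-url-join-in-python-using-multiple-parameters/',
    '41756140#41756140',
]
print(urljoin_all(parts))
```

Like the recursive version, this fails on an empty list (reduce raises TypeError when given no items and no initial value).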

Note that urllib.parse.urljoin() behaves slightly differently from the os.path.join() approach mentioned by @AnukuL.
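
A few cases illustrate the difference. This is a small sketch using example.com as a placeholder host, with posixpath.join standing in for os.path.join on a POSIX system.

```python
import posixpath
from urllib.parse import urljoin

# Without a trailing slash, urljoin replaces the last segment of the base.
a = urljoin("http://example.com/v1", "path")    # http://example.com/path

# With a trailing slash, the relative part is appended.
b = urljoin("http://example.com/v1/", "path")   # http://example.com/v1/path

# A leading slash on the second URL resolves from the site root.
c = urljoin("http://example.com/v1/", "/path")  # http://example.com/path

# posixpath.join appends regardless of a trailing slash on the first part...
d = posixpath.join("v1", "path")                # v1/path

# ...but an absolute second part discards the first one entirely.
e = posixpath.join("v1", "/path")               # /path
```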

Jeyekomon

You could use the str.join function, as suggested in this other answer:

url = urljoin('http://example.com:port', '/'.join(['v1','path']))

If any path segment has leading or trailing slashes, remove them with str.strip first:

path = '/'.join(s.strip('/') for s in ["/v1", "/path"])
url = urljoin('http://example.com:port', path)
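
Wrapped up as a reusable helper, the strip-then-join idea might look like this sketch. The function name build_url and the port 8080 are assumptions for illustration.

```python
from urllib.parse import urljoin


def build_url(base, *segments):
    # Drop surrounding slashes, then skip segments that end up empty.
    cleaned = (s.strip("/") for s in segments)
    path = "/".join(s for s in cleaned if s)
    return urljoin(base, path)


print(build_url("http://example.com:8080", "/v1", "/path"))
# http://example.com:8080/v1/path
print(build_url("http://example.com:8080", "v1/", "", "/path/"))
# http://example.com:8080/v1/path
```

Because the final step is still urljoin, a base that already has a path without a trailing slash would lose its last segment.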
lorsanta