How to get everything after last slash in a URL?

Question

How can I extract whatever follows the last slash in a URL in Python? For example, these URLs should return the following:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

I've tried urlparse, but that gives me the full path filename, such as page/page/12345.

If the URL might contain query strings like `...?foo=bar` and you don't want them, I'd suggest using `urlparse` in combination with naeg's `basename` suggestion. – plundra, Aug 31 '11 at 07:31
http://docs.python.org/library/urlparse.html#module-urlparse — Rusty Rob, Aug 31 '11 at 07:56
URLs can end with a slash. If you need `http://www.test.com/TEST1/` to return `TEST1` then all these answers aren't for you. — Boris Verkhovskiy, Oct 06 '20 at 01:19
I'm a little disappointed that no one used the url of this question in their example :~( — Josie Thompson, Aug 27 '21 at 04:02
@Boris: Not anymore - since your answer (and now also mine). ;-) — lcnittl, Dec 22 '21 at 10:11

score 340 · Accepted Answer · edited Dec 21 '15 at 04:02

You don't need anything fancy: the string methods in the standard library let you easily split your URL into the 'filename' part and the rest:

url.rsplit('/', 1)

So you can get the part you're interested in simply with:

url.rsplit('/', 1)[-1]
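
As a quick check, the snippet below applies this to the three URLs from the question. It is only a sketch: the `urls` and `results` names are illustrative, and it assumes the URLs carry no query string, fragment, or trailing slash.

```python
# Illustrative check of rsplit('/', 1)[-1] against the question's URLs.
urls = [
    "http://www.test.com/TEST1",
    "http://www.test.com/page/TEST2",
    "http://www.test.com/page/page/12345",
]

# maxsplit=1 splits once, at the rightmost slash; [-1] takes what follows it.
results = [url.rsplit("/", 1)[-1] for url in urls]
print(results)  # ['TEST1', 'TEST2', '12345']
```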

edited Dec 21 '15 at 04:02

Remi Guan

answered Aug 31 '11 at 07:28

Luke404

16

`url.rsplit('/', 1)` returns a list, and `url.rsplit('/', 1)[-1]` is the bit after the last slash. – Hugo Oct 13 '15 at 12:26
5

Another way to do it would be: `url.rsplit('/', 1).pop()` – Alex Fortin Mar 02 '18 at 17:55
20

**WARNING:** This basic trick breaks completely on URLs such as `http://www.example.com/foo/?entry=the/bar#another/bar`. But basic parsing like `rsplit` is okay if you are absolutely certain there will never be any slashes in your query or fragment parameters. However, I shudder to think of how many codebases actually contain this `rsplit` code and its associated bug with query handling. **People who want ABSOLUTE SECURITY AND RELIABILITY should be using `urllib.parse.urlparse()` instead! You can then use the `path` value that it returns and split THAT to ensure that you've split ONLY the path.** – Mitch McMabers May 31 '20 at 07:26
14

**CODE: An example of how to implement the better method:** `from urllib.parse import urlparse; p = urlparse("http://www.example.com/foo.htm?entry=the/bar#another/bar"); print(p.path.rsplit("/", 1)[-1])` Result: `foo.htm` – Mitch McMabers May 31 '20 at 07:37
@MitchMcMabers please turn this into an answer (which should then be the accepted one) – Caterpillaraoz Feb 19 '21 at 18:05
1

@Caterpillaraoz I count two non-accepted answers here that suggest exactly this for years now :) – tzot Sep 20 '21 at 08:51
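
To make the caveat raised in these comments concrete, here is a minimal sketch comparing plain `rsplit` on the raw URL with splitting only the parsed path. The URL is the one from the comment above; the variable names `naive` and `from_path` are illustrative.

```python
from urllib.parse import urlparse

url = "http://www.example.com/foo.htm?entry=the/bar#another/bar"

# Splitting the raw URL picks up the slash inside the fragment.
naive = url.rsplit("/", 1)[-1]

# Splitting only the path component ignores the query and fragment.
from_path = urlparse(url).path.rsplit("/", 1)[-1]

print(naive)      # bar
print(from_path)  # foo.htm
```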

score 94 · Answer 2 · answered Aug 31 '11 at 07:31

One more (idio(ma)tic) way:

URL.split("/")[-1]

answered Aug 31 '11 at 07:31

Kimvais

3

Yes this is more straightforward than using `rsplit`. – Jan Kyu Peblik Aug 06 '19 at 15:50
plus 1 for the funny comment haha – Jacky Supit Mar 16 '22 at 02:13

score 16 · Answer 3 · answered Aug 31 '11 at 07:28

rsplit should be up to the task:

In [1]: 'http://www.test.com/page/TEST2'.rsplit('/', 1)[1]
Out[1]: 'TEST2'

answered Aug 31 '11 at 07:28

Benjamin Wohlwend

score 13 · Answer 4 · answered Apr 04 '13 at 05:51

urlparse is fine to use if you want to (say, to get rid of any query string parameters).

import urllib.parse

urls = [
    'http://www.test.com/TEST1',
    'http://www.test.com/page/TEST2',
    'http://www.test.com/page/page/12345',
    'http://www.test.com/page/page/12345?abc=123'
]

for i in urls:
    url_parts = urllib.parse.urlparse(i)
    path_parts = url_parts[2].rpartition('/')
    print('URL: {}\nreturns: {}\n'.format(i, path_parts[2]))

Output:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

URL: http://www.test.com/page/page/12345?abc=123
returns: 12345

Using `urlparse` is the right answer, but this will return `""` if your url ends with a `/`. — Boris Verkhovskiy, Oct 06 '20 at 01:24
using `i.rstrip('/')` would solve the empty path when ending in / — neves, Nov 26 '21 at 21:29

score 13 · Answer 5 · edited Aug 27 '19 at 11:56

You can do like this:

import os

head, tail = os.path.split(url)

Where tail will be your file name.

edited Aug 27 '19 at 11:56

Harsha Biyani

answered Sep 20 '13 at 13:53

neowinston

1

This won't work on systems where the path separator is not "/". One of the notes in the os.path [docs](https://docs.python.org/3/library/os.path.html) mentions a posixpath, but I couldn't import it on my system: "you can also import and use the individual modules if you want to manipulate a path that is always in one of the different formats. They all have the same interface: posixpath for UNIX-style paths" – aschmied Sep 24 '21 at 19:48
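
For what it's worth, `posixpath` is part of the standard library and uses `/` as the separator on every platform. A hedged sketch, assuming the path is first isolated with `urlparse` (the example URL is taken from another answer in this thread):

```python
import posixpath
from urllib.parse import urlparse

url = "http://www.test.com/page/page/12345?abc=123"

# posixpath treats "/" as the separator on every operating system.
head, tail = posixpath.split(urlparse(url).path)

print(head)  # /page/page
print(tail)  # 12345
```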

score 10 · Answer 6 · edited Mar 15 '20 at 02:31

import os

os.path.basename(os.path.normpath('/folderA/folderB/folderC/folderD/'))
# returns 'folderD'

edited Mar 15 '20 at 02:31

Stéphane Bruckert

answered Jan 15 '19 at 05:01

Rochan

1

This also works: `from pathlib import Path; print(f"Path(redirected_response.url).stem: {Path(redirected_response.url).stem!r}")` – Alex Glukhovtsev Jun 25 '20 at 08:35
[URLs](https://tools.ietf.org/html/rfc3986#section-3) aren't file paths, they can contain a `?query=string` or a `#fragment` after the path. – Boris Verkhovskiy Nov 18 '20 at 22:07

score 5 · Answer 7 · answered Apr 12 '18 at 14:32

Here's a more general, regex way of doing this:

    re.sub(r'^.+/([^/]+)$', r'\1', url)

answered Apr 12 '18 at 14:32

sandoronodi

1

can you explain it a bit? – Revolucion for Monica Jan 13 '20 at 16:25
@sandoronodi. Thanks for your solution. If the url is embedded in a long string, then how can I keep the information after the last `/`? Thank you. – Sophia Jun 25 '23 at 21:40
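
Since an explanation was requested: as far as I can tell, `^.+/` greedily consumes everything up to the last slash, `([^/]+)$` captures the run of non-slash characters at the end, and `\1` replaces the whole match with that captured group. The sketch below uses sample URLs from the question; the names `pattern`, `last`, and `unchanged` are illustrative. Note that when nothing follows the last slash the pattern does not match, so `re.sub` returns the input unchanged.

```python
import re

# ^.+/     greedy: everything up to and including the last slash
# ([^/]+)$ capture the trailing run of non-slash characters
pattern = re.compile(r"^.+/([^/]+)$")

last = pattern.sub(r"\1", "http://www.test.com/page/TEST2")
unchanged = pattern.sub(r"\1", "http://www.test.com/TEST1/")

print(last)       # TEST2
print(unchanged)  # http://www.test.com/TEST1/
```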

Boris Verkhovskiy · Answer 8 · 2022-01-09T07:22:29.370

Use urlparse to get just the path and then split the path you get from it on / characters:

from urllib.parse import urlparse

my_url = "http://example.com/some/path/last?somequery=param"
last_path_fragment = urlparse(my_url).path.split('/')[-1]  # returns 'last'

Note: if your URL ends with a / character, the above will return '' (i.e. the empty string). If you want to handle that case differently, you need to strip any trailing / characters before you split the path:

my_url = "http://example.com/last/"
# handle a URL ending in `/` by stripping trailing slashes first.
last_path_fragment = urlparse(my_url).path.rstrip('/').split('/')[-1]  # returns 'last'

tzot · Answer 9 · 2020-10-07T13:36:19.513

4

First extract the path element from the URL:

from urllib.parse import urlparse
parsed = urlparse('https://www.dummy.example/this/is/PATH?q=/a/b&r=5#asx')

and then you can extract the last segment with string functions:

parsed.path.rpartition('/')[2]

(in this example, the result is 'PATH')

edited Oct 07 '20 at 13:36

answered Sep 19 '11 at 09:22

tzot

1

or we can use `parsed.path.rpartition('/')[-1]` to get the last segment – Franz Wong May 11 '22 at 04:14
1

`.partition` always returns a 3-element-tuple, so `[-1]` is `[2]`. – tzot May 12 '22 at 10:11
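
A small sketch of that 3-tuple behaviour, using the path from the answer's example URL (the variable names are illustrative):

```python
path = "/this/is/PATH"

# rpartition splits at the last separator: (before, separator, after).
parts = path.rpartition("/")
print(parts)  # ('/this/is', '/', 'PATH')

# Without any separator, the first two items are empty strings.
no_slash = "PATH".rpartition("/")
print(no_slash)  # ('', '', 'PATH')
```

Either way the last item is the final segment, which is why `[2]` and `[-1]` are interchangeable here.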

lcnittl · Answer 10 · 2021-12-22T10:17:53.120

The following solution, which uses pathlib to parse the path obtained from urllib.parse, makes it possible to get the last part even when a trailing slash is present:

import urllib.parse
from pathlib import Path

urls = [
    "http://www.test.invalid/demo",
    "http://www.test.invalid/parent/child",
    "http://www.test.invalid/terminal-slash/",
    "http://www.test.invalid/query-params?abc=123&works=yes",
    "http://www.test.invalid/fragment#70446893",
    "http://www.test.invalid/has/all/?abc=123&works=yes#70446893",
]

for url in urls:
    url_path = Path(urllib.parse.urlparse(url).path)
    last_part = url_path.name  # use .stem to cut file extensions
    print(f"{last_part=}")

yields:

last_part='demo'
last_part='child'
last_part='terminal-slash'
last_part='query-params'
last_part='fragment'
last_part='all'
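
To illustrate the `.name` versus `.stem` remark in the code comment, here is a hedged variant. It swaps in `PurePosixPath`, which performs no filesystem access and always uses `/` as the separator; the URL below is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

url = "http://www.test.invalid/files/report.pdf?abc=123"  # hypothetical URL

# PurePosixPath does no filesystem access and always uses "/" as separator.
url_path = PurePosixPath(urlparse(url).path)

print(url_path.name)  # report.pdf
print(url_path.stem)  # report
```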

score 0 · Answer 11 · answered May 19 '17 at 09:16

Split the URL and pop the last element:

url.split('/').pop()

answered May 19 '17 at 09:16

Atul Yadav

score 0 · Answer 12 · answered Jun 10 '21 at 08:58

Split the URL and pop the last element. The JavaScript snippet below only illustrates how `pop()` behaves on an array:

const plants = ['broccoli', 'cauliflower', 'cabbage', 'kale', 'tomato'];

console.log(plants.pop());
// expected output: "tomato"

console.log(plants);
// expected output: Array ["broccoli", "cauliflower", "cabbage", "kale"]

score 0 · Answer 13 · answered Aug 31 '11 at 07:28

extracted_url = url[url.rfind("/") + 1:]
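
One property of this approach worth spelling out: `str.rfind` returns -1 when no slash is present, so the slice starts at index 0 and the whole string comes back. A minimal sketch, with `after_last_slash` as a hypothetical helper name:

```python
def after_last_slash(url: str) -> str:
    # rfind returns -1 when "/" is absent, so the slice starts at 0
    # and the whole string is returned.
    return url[url.rfind("/") + 1:]

print(after_last_slash("http://www.test.com/page/TEST2"))  # TEST2
print(after_last_slash("no-slashes-here"))  # no-slashes-here
```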

answered Aug 31 '11 at 07:28

fardjad

score -5 · Answer 14 · edited Feb 18 '13 at 22:09

url = 'http://www.test.com/page/TEST2'.split('/')[4]
print(url)

Output: TEST2.

edited Feb 18 '13 at 22:09

sigod

answered Feb 18 '13 at 21:42

live_alone

2

You really should pass `-1` as the index, otherwise this only works on strings with exactly that many `/` – Chris_Rands Sep 30 '16 at 07:24
