How do I remove a query string from URL using Python

Question

Example:

http://example.com/?a=text&q2=text2&q3=text3&q2=text4

After removing "q2", it will return:

http://example.com/?a=text&q3=text3

In this case, there were multiple "q2" and all have been removed.

score 84 · Answer 1 · edited Aug 05 '19 at 16:13

import sys

if sys.version_info.major == 3:
    from urllib.parse import urlencode, urlparse, urlunparse, parse_qs
else:
    from urllib import urlencode
    from urlparse import urlparse, urlunparse, parse_qs

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4&b#q2=keep_fragment'
u = urlparse(url)
query = parse_qs(u.query, keep_blank_values=True)
query.pop('q2', None)
u = u._replace(query=urlencode(query, True))
print(urlunparse(u))

Output:

http://example.com/?a=text&q3=text3&b=#q2=keep_fragment
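
Note that going through parse_qs builds a dict, so repeated keys get grouped together and the bare `b` comes back as `b=`. Below is a minimal sketch of the same idea using parse_qsl, which keeps the pairs as an ordered list. It assumes Python 3 only, and `remove_query_param` is a hypothetical helper name, not something urllib provides. urlencode re-encodes the remaining values, so the result may not be byte-for-byte identical to the input for unusual encodings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remove_query_param(url, name):
    """Return url without any query parameter called name (hypothetical helper)."""
    parts = urlsplit(url)
    # parse_qsl returns (key, value) pairs in their original order,
    # so repeated keys other than `name` stay where they were.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key != name]
    # Rebuild the URL; the fragment is carried over untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(remove_query_param(
    'http://example.com/?a=text&q2=text2&q3=text3&q2=text4#q2=keep_fragment',
    'q2'))
# http://example.com/?a=text&q3=text3#q2=keep_fragment
```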

edited Aug 05 '19 at 16:13 · 林果皞
answered Oct 12 '11 at 02:42 · Miki Tebeka

Best answer. One addition, [geturl](https://docs.python.org/2/library/urlparse.html#urlparse.ParseResult.geturl) method of urlparse object can be used instead of urlunparse `print(u.geturl())` – tug Mar 29 '16 at 10:02
If the goal is to remove all query params, wouldn't `u = u._replace(query='')` also work? We could avoid an extra import this way. – kurifu Oct 12 '17 at 19:01
@kurifu The OP wanted to remove only one parameter, not the whole query. – Miki Tebeka Oct 14 '17 at 07:35
Python 3 imports: `from urllib.parse import urlencode, urlparse, urlunparse, parse_qs` – Vanni Totaro Nov 21 '17 at 11:44
access to a protected member _replace of a class.... how can we avoid this warning message? – IamMashed May 09 '19 at 02:09
@IamMashed This is how namedtuples work - https://docs.python.org/3/library/collections.html#collections.somenamedtuple._replace It's probably some kind of linter adding the warning. – Miki Tebeka May 10 '19 at 09:33

score 75 · Answer 2 · edited Dec 14 '19 at 16:25

To remove all query string parameters:

from urllib.parse import urljoin, urlparse

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
urljoin(url, urlparse(url).path)  # 'http://example.com/'

For Python 2, replace the import with:

from urlparse import urljoin, urlparse

edited Dec 14 '19 at 16:25 · Matthew D. Scholefield
answered Aug 25 '15 at 21:56 · png

I like this approach a _bit_ better than the popular answer because it doesn't call any internal APIs, but it will also eliminate URL fragments, whereas the popular answer will preserve them. It also doesn't solve the OP's exact question (it deletes _all_ query string parameters), but it solves mine :) – Dolph Aug 10 '18 at 14:04
Was first looking into furl, but this removes the need to install another library. Works perfectly! – merlin Jun 07 '20 at 15:33
This should be the accepted answer. I came here twice after a few weeks since it is hard to remember and searched again for it. – merlin Dec 25 '20 at 14:24
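
To illustrate the point about fragments raised in the comments, here is a small sketch comparing the urljoin approach with blanking the query field (the `_replace(query='')` idea mentioned under the top answer). `strip_query` is a hypothetical helper name and the URLs are made up for the example. It also shows an edge case worth knowing about: when the URL has no path at all, the path is an empty string and urljoin returns the base URL unchanged.

```python
from urllib.parse import urljoin, urlsplit


def strip_query(url):
    """Drop the whole query string but keep the fragment (hypothetical helper)."""
    return urlsplit(url)._replace(query='').geturl()


url = 'http://example.com/page?a=text&q2=text2#section'

# The urljoin approach resolves only the path, so the fragment is gone too.
print(urljoin(url, urlsplit(url).path))  # http://example.com/page
# Blanking the query field leaves the fragment in place.
print(strip_query(url))                  # http://example.com/page#section

no_path = 'http://example.com?a=1'
# The path is '' here, and urljoin(base, '') hands back the base as-is.
print(urljoin(no_path, urlsplit(no_path).path))  # http://example.com?a=1
print(strip_query(no_path))                      # http://example.com
```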

score 23 · Answer 3 · answered Feb 06 '19 at 08:03

Isn't this just a matter of splitting a string on a character?

>>> url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> url = url.split('?')[0]
>>> url
'http://example.com/'

answered Feb 06 '19 at 08:03 · Clarius

I was thinking about this solution as well. Can anyone tell me if there are any issues (potential bugs/loopholes) in this proposed solution? – Programer Beginner May 07 '19 at 00:00
@ProgramerBeginner There isn't one, really! – mevers303 Aug 29 '19 at 18:03
The problem will be clear if you carefully read the original question. The OP wanted to remove only one parameter, not all the query parameters. – conradlee Dec 07 '22 at 10:34
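
On the question of pitfalls: besides removing every parameter rather than just one, splitting on '?' also throws away a fragment that follows the query, and it cuts inside the fragment if the fragment itself contains a '?'. If you want to stay with plain string operations, one possible sketch is to peel the fragment off first. `drop_query_keep_fragment` is a hypothetical name and the URLs are made up.

```python
def drop_query_keep_fragment(url):
    """String-only variant that removes the query but keeps the fragment
    (hypothetical helper, not a library function)."""
    # Everything after the first '#' is the fragment, even if it contains '?'.
    base, hash_sep, fragment = url.partition('#')
    # partition() returns the whole string when there is no '?'.
    base = base.partition('?')[0]
    return base + hash_sep + fragment


with_fragment = 'http://example.com/?a=text&q2=text2#section'
print(with_fragment.split('?')[0])              # http://example.com/
print(drop_query_keep_fragment(with_fragment))  # http://example.com/#section

question_mark_in_fragment = 'http://example.com/#top?x=1'
print(question_mark_in_fragment.split('?')[0])              # http://example.com/#top
print(drop_query_keep_fragment(question_mark_in_fragment))  # http://example.com/#top?x=1
```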

score 11 · Answer 4 · answered Dec 02 '16 at 09:08

Using furl, a third-party URL manipulation library for Python (not part of the standard library):

import furl
f = furl.furl("http://example.com/?a=text&q2=text2&q3=text3&q2=text4")
f.remove(['q2'])
print(f.url)

answered Dec 02 '16 at 09:08 · Mayank Jaiswal

Calling it 'python's url manipulation library' makes it sound like it's included in the standard lib, which it isn't. – Mattwmaster58 Apr 23 '20 at 17:21

score 3 · Answer 5 · edited Sep 17 '18 at 09:09

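query_string = "https://example.com/api/api.php?user=chris&auth=true"
# partition() returns the whole string when there is no '?';
# find() would return -1 there and the slice would drop the last character.
url = query_string.partition('?')[0]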

edited Sep 17 '18 at 09:09 · 4b0
answered Sep 17 '18 at 09:08 · XKCD

This does not exactly provide a solution for the given question. Please try improving your answer or deleting it. – Alexander Sep 17 '18 at 09:13
While this code **may** answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value. – Nic3500 Sep 17 '18 at 11:39

score 1 · Answer 6 · answered Nov 14 '18 at 07:57

Or simply put, just use url_query_cleaner() from w3lib.url

from w3lib.url import url_query_cleaner

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
# ('q2',) is a one-element tuple; ('q2') without the comma is just the string 'q2'
url_query_cleaner(url, ('q2',), remove=True)

Output: http://example.com/?a=text&q3=text3

score -1 · Answer 7 · answered Aug 30 '21 at 10:01

Or you could just use strip

>>> l='http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> l.strip('&q2=text4')
'http://example.com/?a=text&q2=text2&q3=text3'
>>>
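
A caveat on this one: strip() treats its argument as a set of characters to remove from both ends, not as a substring, so the example above only works by coincidence, and it leaves the first `q2=text2` in place anyway. A short sketch with a made-up URL shows how it can eat too much. The endswith() slice is just one possible way to remove an exact suffix.

```python
url = 'http://example.com/?a=text&q2=text4'
suffix = '&q2=text4'

# strip() keeps removing trailing characters as long as each one appears
# somewhere in the argument, so '=text' after 'a' is eaten as well.
stripped = url.strip(suffix)
print(stripped)  # http://example.com/?a

# Removing the exact suffix instead leaves the rest of the query alone.
trimmed = url[:-len(suffix)] if url.endswith(suffix) else url
print(trimmed)   # http://example.com/?a=text
```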

answered Aug 30 '21 at 10:01 · Drew97

score -2 · Answer 8 · answered Oct 12 '11 at 02:49

import re

q = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
todelete = "q2"
# Delete every "q2=value" pair, along with the "&" that follows it (if any)
r = re.sub(re.escape(todelete) + r'=[a-zA-Z_0-9]*&*', '', q)
# Delete the possible trailing &
r = re.sub(r'&$', '', r)

print(r)

answered Oct 12 '11 at 02:49 · lc2817
