I want to get the query names and values from a URL and display them. For example, url='http://host:port_num/file/path/file1.html?query1=value1&query2=value2'

From this, I want to parse the query names and their values and print them.

Peter Mortensen
Myjab
  • The canonical question is *[How can I split a URL string up into separate parts in Python?](https://stackoverflow.com/questions/449775/)* (2009). – Peter Mortensen Nov 28 '22 at 02:36

2 Answers


Don't use a regex! Use urlparse (in Python 3, the same functions live in urllib.parse).

>>> import urlparse
>>> urlparse.parse_qs(urlparse.urlparse(url).query)
{'query2': ['value2'], 'query1': ['value1']}
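The session above uses the Python 2 module name. As a minimal sketch of the same idea on Python 3, assuming the example URL from the other answer in this thread, the functions can be imported from `urllib.parse`:

```python
from urllib.parse import urlparse, parse_qs, parse_qsl

# Example URL borrowed from the other answer in this thread.
url = 'http://www.example.com:8080/abcd/dir/file1.html?query1=value1&query2=value2'

# urlparse splits the URL into components; .query is the part after '?'.
query = urlparse(url).query

# parse_qs maps each name to a list of values (a name may repeat in a query).
params = parse_qs(query)
for name, values in params.items():
    print(name, values)

# parse_qsl returns (name, value) pairs in the order they appear.
pairs = parse_qsl(query)
print(pairs)
```

`parse_qs` is convenient for lookups by name, while `parse_qsl` keeps the original order and repeated names as separate pairs.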
Afrowave
teukkam

I agree that it's best not to use a regular expression and better to use urlparse, but here is my regular expression.

Modules like urlparse were developed specifically to handle all URLs and are much more reliable than a regular expression, so use them if you can.

>>> import re
>>> x = 'http://www.example.com:8080/abcd/dir/file1.html?query1=value1&query2=value2'
>>> query_pattern = r'(query\d+)=(\w+)'
>>> # query_pattern = r'(\w+)=(\w+)'    a more general pattern
>>> re.findall(query_pattern, x)
[('query1', 'value1'), ('query2', 'value2')]
Peter Mortensen
jamylak
  • It might be worth elaborating on why regex is the wrong hammer for this nail. – Li-aung Yip Apr 04 '12 at 11:07
  • Alright, I think I explained it very briefly. Feel free to explain it better if you want :D – jamylak Apr 04 '12 at 11:09
  • Thank you, jamylak. Can you please tell me how we can split it generically? For example, if the query contains "name=asd&name1=qwerty", the above pattern will not work. So instead of using name, what can we use in the query pattern? I'm asking because I'm new to Python regex :) – Myjab Apr 09 '12 at 09:44
  • See the commented out code, `query_pattern='(\w+)=(\w*)'`. That should work for any query. – jamylak Apr 09 '12 at 09:46
  • Oh sorry, I hadn't seen it. – Myjab Apr 09 '12 at 10:11
  • Actually it should be `query_pattern='(\w+)=(\w+)'`. I don't think it will matter compared with the previous pattern, but anyway. :D – jamylak Apr 09 '12 at 10:13
  • Parsing URLs with regular expressions is a terrible idea; when there's a urlparse module *in the stdlib* this is unforgivable. – hdgarrood Jul 18 '14 at 03:57
  • That won't work for any query; hyphen-minus is a valid character in query string, as are exclamation mark, comma... see http://www.ietf.org/rfc/rfc1738.txt – hdgarrood Jul 18 '14 at 04:00
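To illustrate the point raised in the last comment, here is a small sketch comparing the general `(\w+)=(\w+)` pattern with the standard-library parser. The URL is a made-up example (not from the thread) chosen to contain a hyphen, a percent-encoded space, and a comma:

```python
import re
from urllib.parse import urlparse, parse_qsl

# Hypothetical URL with characters that \w does not match.
url = 'http://www.example.com/search?first-name=Ada&city=New%20York&tag=a,b'

# The regex truncates names and values at the first non-word character.
regex_pairs = re.findall(r'(\w+)=(\w+)', url)
print(regex_pairs)
# [('name', 'Ada'), ('city', 'New'), ('tag', 'a')]

# parse_qsl keeps the full names and decodes percent-encoding.
parsed_pairs = parse_qsl(urlparse(url).query)
print(parsed_pairs)
# [('first-name', 'Ada'), ('city', 'New York'), ('tag', 'a,b')]
```

The regex silently returns wrong data rather than failing, which is the main reason the parser is the safer choice.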