Get the string between ampersands or at the end of a URL

I have the following possible URLs:

http://google.com/sadfasdfsd&AA=mytag&SS=sdfsdf
http://google.com/sadfasdfsd&AA=mytag

What is the best way in Python to extract mytag from a string containing &AA=mytag? There are two cases: &AA=mytag is followed by another parameter, or it sits at the end of the URL. How do I match both cases with a single regular expression?

This question follows from: Python Get Tags from URL

>>> import re
>>> url = 'http://google.com/sadfasdfsd&AA=mytag&SS=sdfsdf'  # avoid shadowing the built-in str
>>> m = re.search(r'.*\&AA=([^&]*)\&.*', url)
>>> m.group(1)
'mytag'

But this only works when I have this type of URL:

http://google.com/sadfasdfsd&AA=mytag&SS=sdfsdf
  • Those are some pretty fishy looking urls -- Usually there'd be a `?` before the ampersands start popping up ... – mgilson Jun 13 '14 at 22:23
  • So you want all parameters from url to be extracted if I understood correctly? Then you might want to look into this http://stackoverflow.com/questions/5074803/retrieving-parameter-from-url-in-python – Aamir Rind Jun 13 '14 at 22:26
  • If you want to use a regex, try `.*\&AA=([^&]*)\&*.*`. You're currently requiring the second `&`. – erbridge Jun 13 '14 at 22:28
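
The regex approach from the comments can be simplified so that the trailing `&` is not needed at all. The following is only a minimal sketch: the helper name `get_tag` and its `key` parameter are made up for illustration, and it assumes the tag value never contains a literal `&`. Because `[^&]*` stops either at the next `&` or at the end of the string, both URL shapes from the question are covered.

```python
import re

# Hypothetical helper (not from the original post).
# Assumes the value after '&<key>=' never contains a literal '&'.
def get_tag(url, key="AA"):
    """Return the text after '&<key>=' up to the next '&' or the end of the string."""
    match = re.search(r"&" + re.escape(key) + r"=([^&]*)", url)
    return match.group(1) if match else None

print(get_tag("http://google.com/sadfasdfsd&AA=mytag&SS=sdfsdf"))  # mytag
print(get_tag("http://google.com/sadfasdfsd&AA=mytag"))            # mytag
```

Using `re.search` without the leading and trailing `.*` keeps the pattern short, since `search` already scans the whole string for the first match.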

1 Answer

Use a URL parsing library.

>>> import urlparse  # Python 2 module; in Python 3 it lives in urllib.parse
>>> url = urlparse.urlparse('http://google.com/sadfasdfsd?AA=mytag&SS=sdfsdf')
>>> url.query
'AA=mytag&SS=sdfsdf'
>>> urlparse.parse_qs(url.query)
{'AA': ['mytag'], 'SS': ['sdfsdf']}
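
For Python 3, the same idea might look like the sketch below, using `urllib.parse` from the standard library. The helper names `get_param` and `get_param_no_qmark` are invented for illustration. The second helper is a workaround built on an assumption: the URLs in the question have no `?`, so `urlparse` treats everything after the host as the path, and the sketch simply takes whatever follows the first `&` as the query string.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical helper: works for well-formed URLs that use '?' before the query.
def get_param(url, key):
    """Return the first value of `key` in the URL's query string, or None."""
    values = parse_qs(urlparse(url).query).get(key)
    return values[0] if values else None

# Hypothetical workaround for the question's URLs, which lack a '?'.
# Assumption: everything after the first '&' is the query string.
def get_param_no_qmark(url, key):
    """Return the first value of `key`, treating text after the first '&' as the query."""
    _, _, tail = url.partition("&")
    values = parse_qs(tail).get(key)
    return values[0] if values else None

print(get_param("http://google.com/sadfasdfsd?AA=mytag&SS=sdfsdf", "AA"))           # mytag
print(get_param_no_qmark("http://google.com/sadfasdfsd&AA=mytag&SS=sdfsdf", "AA"))  # mytag
print(get_param_no_qmark("http://google.com/sadfasdfsd&AA=mytag", "AA"))            # mytag
```

`parse_qs` returns a list per key because a parameter may repeat; the helpers return only the first value, which fits the single-tag case in the question.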
mgilson
John Kugelman