Parsing a URL using regular expression

Question

I am trying to parse the 'Meghan' part from the line:

link = http://python-data.dr-chuck.net/known_by_Meghan.html

...with the following regex:

print re.findall('by_(\S+).html$',link)

I am getting the output:

[u'Meghan']

Why am I getting the 'u'?

score 0 · Answer 1 · answered May 07 '16 at 11:03

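The u prefix means the string is a unicode string; it is part of how Python 2 displays the value inside the list, not part of the text itself. Depending on what you do with it, you can ignore it for the most part, or you can convert it to ASCII with .encode('ascii').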
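As a rough sketch of the point above (not the asker's exact script): the prefix shows up only because printing a list displays the repr of each element. The variable names matches, name and ascii_name are made up for illustration, and the dot before html is escaped here so that it matches a literal period, which is a small change from the pattern in the question. On Python 3 every str is already unicode and no u prefix is displayed at all.

```python
import re

# The u"" literal is accepted by Python 2 and by Python 3.3+.
link = u"http://python-data.dr-chuck.net/known_by_Meghan.html"

# Escape the dot so it matches a literal "." rather than any character.
matches = re.findall(r"by_(\S+)\.html$", link)

# Printing the list shows the repr of its elements (with the u prefix on
# Python 2); printing the element itself shows only the text.
name = matches[0]
print(name)

# Optional: convert to an ASCII-encoded byte string, as suggested above.
ascii_name = name.encode("ascii")
```

Indexing into the result (matches[0]) is usually enough if the goal is just to print or compare the name; the encode step is only needed when something downstream requires bytes.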

answered May 07 '16 at 11:03

yelsayed

If this helped you, please upvote and/or accept the answer. :) – yelsayed May 07 '16 at 20:09
