URL Regex for Python

Question

I am trying to match webpage URLs using a regex. I am using the method below.

import re

# Dots are escaped so they match a literal "." rather than any character.
regex_url = r'https://www\.website\.com/books/\w{8}$'
is_read = re.match(regex_url, request.url) is not None
if not is_read:
    add_to_read(token)

Everything works well with the above regex, but there is now a new URL pattern that I can't seem to write the right regex for.

The new URL pattern is

https://www.website.com/books/Ab7us83xI?varient=web

That is 9 characters, followed by a question mark, then the word 'varient', and then '=web'. Can anyone help me get the correct regex for this?

Only the first 9 characters change every time. Apologies if this is a stupid question.

Many thanks.

Hi Jay, Is there a reason why you do not want to use `urlparse` from `urllib.parse` and instead want your own regex? What you seem to care about is the `path` segment of the URL. `?K=V` are `query` segments of the URL. — Sudheesh Singanamalla, Feb 25 '22 at 22:38
Hi Sudheesh. Thank you for your response. I will certainly consider this solution. — Han Hanz, Feb 27 '22 at 13:29
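
The `urlparse` suggestion in the comments above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name `is_book_url` and the decision to check the scheme and host are assumptions, and the 9-character ID length is taken from the question.

```python
import re
from urllib.parse import urlparse, parse_qs

# Path pattern only: "/books/" followed by exactly 9 word characters.
BOOK_PATH = re.compile(r'/books/\w{9}')

def is_book_url(url):
    """Hypothetical helper: True for a book page URL with varient=web."""
    parts = urlparse(url)
    query = parse_qs(parts.query)  # e.g. {'varient': ['web']}
    return (
        parts.scheme == 'https'
        and parts.netloc == 'www.website.com'
        and BOOK_PATH.fullmatch(parts.path) is not None
        and query.get('varient') == ['web']
    )
```

Unlike a single regex over the whole URL, this version also accepts URLs that carry additional query parameters next to `varient=web`, which may or may not be what you want.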

score 1 · Accepted Answer · answered Feb 25 '22 at 22:43


Is this what you need?

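https://www\.website\.com/books/\w{9}\?varient=web$

\.          - match a literal dot (an unescaped . matches any character)
\w{9}       - match exactly 9 word characters (letters, digits, or underscore)
\?          - match a literal question mark
varient=web - match the literal text varient=web
$           - match the end of the string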
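
A minimal sketch of how this pattern might be used from Python. The names `NEW_URL`, `EITHER_URL`, and `matches_new_pattern` are made up for illustration, and the combined pattern assumes the old 8-character URLs should still be accepted, which the question does not state explicitly.

```python
import re

# New pattern only: 9 word characters followed by "?varient=web".
NEW_URL = re.compile(r'https://www\.website\.com/books/\w{9}\?varient=web')

# Assumption: accept either the old 8-character form or the new form.
EITHER_URL = re.compile(
    r'https://www\.website\.com/books/(?:\w{8}|\w{9}\?varient=web)'
)

def matches_new_pattern(url):
    # fullmatch requires the entire string to match, so no trailing $ is needed.
    return NEW_URL.fullmatch(url) is not None

print(matches_new_pattern('https://www.website.com/books/Ab7us83xI?varient=web'))
print(matches_new_pattern('https://www.website.com/books/Ab7us83xI'))
```

The first call matches and the second does not, because the query string is a required part of the pattern.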


vs97
