Not sure about how /?(.+) works in my regex

Asked Feb 27 '20 at 18:03

I would like to capture three groups (protocol, domain, page path) from the URL http://www.interactivedynamicvideo.com/ using this regex pattern: `pattern = r'(\w+)://([\w.-]+)/?(.+)'`. Since this URL is one of the values in my Series, I used `series.str.extract(pattern)` to capture the groups. I expected to get `http` for group 1, `www.interactivedynamicvideo.com` for group 2, and nothing for group 3. However, I got `/` in group 3. I thought the `/` would be matched by `/?`. Could someone explain why the `/` ends up in `(.+)` instead of being matched by `/?`?

Thank you for your time.

Sherlock_Hound

`.+` requires at least 1 char. Change to `.*`, see https://regex101.com/r/3fCIkF/1 – Wiktor Stribiżew Feb 27 '20 at 18:05
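
To illustrate the comment above, here is a minimal sketch using Python's built-in `re` module rather than pandas, assuming the same pattern semantics apply (pandas string methods are built on `re`). With `(.+)`, the optional `/?` first takes the slash, but then `(.+)` has nothing left to match, so the engine backtracks: `/?` matches nothing and `(.+)` takes the `/`. With `(.*)`, the last group may be empty, so `/?` keeps the slash. The variable names and the second URL (`http://example.com/a/b`) are made up for illustration.

```python
import re

url = "http://www.interactivedynamicvideo.com/"

# Pattern from the question: (.+) must match at least one character.
original = re.compile(r"(\w+)://([\w.-]+)/?(.+)")

# Pattern suggested in the comment: (.*) is allowed to match nothing.
fixed = re.compile(r"(\w+)://([\w.-]+)/?(.*)")

# With (.+), the optional slash is given up so that group 3 can match "/".
original_groups = original.match(url).groups()

# With (.*), the slash stays with /? and group 3 is the empty string.
fixed_groups = fixed.match(url).groups()

# Hypothetical URL with a real path: here /? consumes the slash in both cases,
# because there is still text left for the last group.
path_groups = original.match("http://example.com/a/b").groups()

print(original_groups)
print(fixed_groups)
print(path_groups)
```

So the `/?` is not ignored; it only hands the slash over when that is the only way for the rest of the pattern to succeed.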
