
I would like to capture three groups (protocol, domain, page path) from the URL http://www.interactivedynamicvideo.com/ and wrote this regex pattern: pattern = r'(\w+)://([\w.-]+)/?(.+)'. Since this URL is one of the values in my series, I used series.str.extract(pattern) to capture the groups. I expected to get http for group 1, www.interactivedynamicvideo.com for group 2, and nothing for group 3. However, I got / in group 3. I thought the / would be matched by /?. Could someone explain why the / ends up in (.+) instead of being matched by /?
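
Here is a minimal reproduction. It is a sketch that uses the standard re module rather than pandas so that it runs on its own; as far as I understand, series.str.extract applies the same regex search to each value. The names url, match, groups, relaxed_pattern, and relaxed_groups are only for illustration, and the (.*) variant is not part of my original pattern; I added it only to compare what happens when group 3 is allowed to be empty.

```python
import re

url = "http://www.interactivedynamicvideo.com/"

# Pattern from the question: group 3 uses (.+), which needs at least one character.
pattern = r"(\w+)://([\w.-]+)/?(.+)"
match = re.search(pattern, url)
groups = match.groups()
print(groups)  # ('http', 'www.interactivedynamicvideo.com', '/')

# Comparison variant (an assumption, not the original pattern):
# (.*) may match zero characters, so /? can keep the trailing slash.
relaxed_pattern = r"(\w+)://([\w.-]+)/?(.*)"
relaxed_groups = re.search(relaxed_pattern, url).groups()
print(relaxed_groups)  # ('http', 'www.interactivedynamicvideo.com', '')
```

With the original pattern, /? first takes the slash, but then (.+) has nothing left to match, so the engine backtracks: /? matches nothing and (.+) takes the slash. That seems to be the behavior I am seeing.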

Thank you for your time.
