I have a sitemap XML file and I want to run a script which extracts all the urls and print it. I have tried re.findall(r'(https?://\S+)', url)
but this prints the closing tags like : "https://www.tutorialspoint.com/python/python_reg_expressions.htm /liv"
I don't want to print the suffix ' /liv ' how do I implement this using regex ?