I am scraping a website with really poor HTML structure, and I am getting text like this:
Example:
Creator:
\r\r
My Name
\r\r
Date created:
\r\r
123123
<br><br>
Title:
\r\r
Title here
\r\r
I want it to look like this:
Creator: My Name
\r\r
Date created:123123
Title:Title here
\r\r
I currently have this substitution: _str = re.sub('\r+','',_str)
But I know it's wrong, because it removes every \r, including the ones I want to keep between entries.
Is there any way to iterate over re.sub()? Or do you have any other idea how I could achieve my goal?
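
One possible direction, as a minimal sketch rather than a definitive solution: instead of iterating, anchor the pattern on the colon so that only the line breaks directly after a label are removed. This assumes every label ends with ":" and is followed immediately by one or more \r (or \n) characters. The function name join_labels and the sample string are hypothetical, made up for illustration.

```python
import re

def join_labels(text):
    # Match a colon, optional spaces/tabs, then one or more line-break
    # characters, and replace the whole match with ": ".
    # Line breaks that do not follow a colon are left untouched.
    return re.sub(r':[ \t]*[\r\n]+[ \t]*', ': ', text)

# Hypothetical sample modelled on the text above
sample = "Creator:\r\rMy Name\r\rDate created:\r\r123123\r\rTitle:\r\rTitle here\r\r"
print(repr(join_labels(sample)))
# 'Creator: My Name\r\rDate created: 123123\r\rTitle: Title here\r\r'
```

The replacement string inserts a single space after each colon; using ':' instead would give "Date created:123123" with no space. A value that itself ends in a colon at the end of a line would also be joined to the next line, so this only fits if labels are the only things ending in ":".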