I'm new to regular expressions and Python, and I'm trying to extract revision numbers from an HTML page. I used a proxy and urllib to fetch the page and read it into a string. The text looks like this:
<p>Proxy 3.2.1 r72440<br>
SlotBios 11.00</p>
<p><strong><span style="color: rgb(255, 0, 0);">Random Text 4.23.6 r98543<br>
...</tr>...
<p><strong><span style="color: rgb(255, 0, 0);">Random Text 4.33.6 r98549<br>
I want to parse the text and extract the revision numbers only from the red lines, i.e. the ones inside a span styled with color: rgb(255, 0, 0). So in this example, I want to get 98543 and 98549.
I can already extract the revision numbers from all the lines with:
paragraphs = re.findall(r'r(\d*)<br>', str(html))
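To make this reproducible, here is a minimal sketch with the sample text inlined. The hard-coded `html` string is only a stand-in for the page I actually fetch with urllib, and I left out the elided `...</tr>...` line:

```python
import re

# Stand-in for the fetched page: the sample lines from above, inlined as a
# plain string (the real script reads this from urllib instead).
html = (
    '<p>Proxy 3.2.1 r72440<br>\n'
    'SlotBios 11.00</p>\n'
    '<p><strong><span style="color: rgb(255, 0, 0);">Random Text 4.23.6 r98543<br>\n'
    '<p><strong><span style="color: rgb(255, 0, 0);">Random Text 4.33.6 r98549<br>\n'
)

# Same pattern as in my current code: an "r", then digits, then "<br>".
paragraphs = re.findall(r'r(\d*)<br>', html)
print(paragraphs)  # ['72440', '98543', '98549']
```

The first entry comes from the non-red "Proxy" line, which is the one I want to exclude.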
However, I'm stuck on how to restrict the match to the red lines only. My current code also returns 72440, which comes from a line that isn't red. Any idea how to get around this? Thanks!