I'm trying to extract some lines from an HTML source file. The sample below is simplified, but the idea is the same. I want the output in numerical order, that is: Form 1, Form 2, Form 3, Form 4. The problem is that the inner loop starts over from the beginning on the second pass of the outer loop, so I get: Form 1, Form 2, Form 3, Form 2. How can I change the code so that the inner loop continues where it left off and extracts the Form 4 text?
Code
import re

line = ('bla bla bla<form>Form 1</form> some text...<form1>Form 2</form1> more text?'
        'bla bla bla<form>Form 3</form> some text...<form1>Form 4</form1> more text?')

for match in re.finditer('<form>(.*?)</form>', line, re.S):
    print(match.group(1))
    # A new finditer iterator is created on every pass of the outer loop,
    # so this inner loop always begins at the first <form1> match.
    for match1 in re.finditer('<form1>(.*?)</form1>', line, re.S):
        print(match1.group(1))
        break
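
One possible way to get the intended order, offered as a minimal sketch rather than the definitive fix: create the `<form1>` iterator once, before the outer loop, and pull a single match from it with `next()` on each pass so that it keeps its position. The function name `extract_in_order` is hypothetical, and the sketch assumes the `<form>` and `<form1>` tags strictly alternate as they do in the sample string.

```python
import re

line = ('bla bla bla<form>Form 1</form> some text...<form1>Form 2</form1> more text?'
        'bla bla bla<form>Form 3</form> some text...<form1>Form 4</form1> more text?')


def extract_in_order(text):
    """Return <form> and <form1> contents, pairing each <form> with the next <form1>.

    Hypothetical helper; assumes the two tags alternate in the input.
    """
    # Build the <form1> iterator once so it remembers how far it has advanced.
    form1_matches = re.finditer('<form1>(.*?)</form1>', text, re.S)
    results = []
    for match in re.finditer('<form>(.*?)</form>', text, re.S):
        results.append(match.group(1))
        # Take exactly one <form1> match; None means the iterator is exhausted.
        match1 = next(form1_matches, None)
        if match1 is not None:
            results.append(match1.group(1))
    return results


for item in extract_in_order(line):
    print(item)
# prints Form 1, Form 2, Form 3, Form 4 (one per line)
```

If the tags do not strictly alternate, a single `re.finditer` pass with a pattern such as `<(form1?)>(.*?)</\1>` would return the matches in document order instead, with the text in `group(2)`.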