Possible Duplicate:
Parsing HTML in Python
I have a long string of HTML similar to the following:
<ul>
<li><a href="/a/long/link">Class1</a></li>
<li><a href="/another/link">Class2</a></li>
<li><img src="/image/location" border="0">Class3</a></li>
</ul>
It has several list entries (Class1 to Class8). I'd like to turn this into a list in Python with only the class names, as in
["Class1", "Class2", "Class3"]
and so on.
How would I go about doing this? I've tried regular expressions, but I haven't been able to find a pattern that works. Of course, with only 8 classes I could easily do it manually, but I have several more HTML documents to extract data from.
Thanks! :)
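For reference, here is a minimal sketch of the kind of extraction described above, done without regular expressions. It assumes Python 3 and uses only the standard-library `html.parser` module. The names `ListItemText` and `extract_names` are made up for illustration, and the sketch assumes the `<li>` elements are not nested. It collects the text found inside each `<li>` and ignores the tags around it.

```python
from html.parser import HTMLParser


class ListItemText(HTMLParser):
    """Collect the visible text of each (non-nested) <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []       # finished text of each <li>
        self._in_li = False   # True while the parser is inside an <li>
        self._chunks = []     # text fragments seen in the current <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "li" and self._in_li:
            self._in_li = False
            text = "".join(self._chunks).strip()
            if text:
                self.items.append(text)

    def handle_data(self, data):
        # Text inside nested tags such as <a> still arrives here.
        if self._in_li:
            self._chunks.append(data)


def extract_names(html):
    """Return the text of every <li> in the given HTML string."""
    parser = ListItemText()
    parser.feed(html)
    parser.close()
    return parser.items


sample = """<ul>
<li><a href="/a/long/link">Class1</a></li>
<li><a href="/another/link">Class2</a></li>
<li><img src="/image/location" border="0">Class3</a></li>
</ul>"""

print(extract_names(sample))  # ['Class1', 'Class2', 'Class3']
```

The parser tolerates the stray `</a>` in the third entry, which is one reason a real HTML parser tends to be more forgiving here than a hand-written pattern. For several documents, the same function could be called once per file's contents.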