I am parsing RSS content using Universal Feed Parser. In the description tag I sometimes get values like the following:
<!--This is the XML comment -->
<p>This is a Test Paragraph</p></br>
<b>Sample Bold</b>
<m:Table>Sampe Text</m:Table>
In order to remove the HTML elements/tags I am using the following regex:
pattern = re.compile(u'<\/?\w+\s*[^>]*?\/?>', re.DOTALL | re.MULTILINE | re.IGNORECASE | re.UNICODE)
desc = pattern.sub(u" ", desc)
This removes the HTML tags but not the XML comments. How do I remove both the elements and the XML comments?
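
To make the goal concrete, here is a minimal sketch of the behavior I am after, assuming a regex-only approach is acceptable. It puts a comment alternative (`<!--.*?-->`) in front of the tag pattern from above, so comments are consumed before the tag branch is tried. The helper name `strip_markup` and the whitespace collapsing are my own additions for illustration, not part of Universal Feed Parser.

```python
import re

# Comment branch first, then the tag pattern from the question.
# re.DOTALL lets ".*?" span line breaks inside multi-line comments.
pattern = re.compile(r'<!--.*?-->|</?\w+\s*[^>]*?/?>', re.DOTALL | re.UNICODE)

def strip_markup(desc):
    """Hypothetical helper: replace comments and tags with a space,
    then collapse runs of whitespace into single spaces."""
    return ' '.join(pattern.sub(' ', desc).split())

sample = '''<!--This is the XML comment -->
<p>This is a Test Paragraph</p></br>
<b>Sample Bold</b>
<m:Table>Sampe Text</m:Table>'''

print(strip_markup(sample))
```

A regex like this can probably still be fooled by unusual markup (for example a `>` inside an attribute value), so it is only meant to cover simple descriptions like the sample above.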