I'm working on a corpus of email messages and trying to replace every HTML tag in the corpus with the string '<html>'. How can I replace all HTML tags, using the fact that they begin with < and end with >?
Example:
<html>
<body>
This is some random text.
<p>This is some text in a paragraph.</p>
</body>
</html>
Should be translated to:
<html>
<html>
This is some random text.
<html>This is some text in a paragraph.<html>
<html>
<html>
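
To make the idea concrete, here is a minimal sketch in Python (an assumption, since the post does not name a language). It treats anything that starts with < and runs up to the next > as a tag and swaps it for the placeholder seen in the expected output. The names TAG_PATTERN and replace_tags are made up for illustration.

```python
import re

# Assumption: a "tag" is "<", then one or more characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def replace_tags(text, placeholder="<html>"):
    """Replace every tag-like span in text with placeholder."""
    # A function replacement keeps the placeholder literal, so any
    # backslashes in it are not interpreted as regex escapes.
    return TAG_PATTERN.sub(lambda match: placeholder, text)


message = "<p>This is some text in a paragraph.</p>"
print(replace_tags(message))
# prints: <html>This is some text in a paragraph.<html>
```

This relies only on the "< ... >" shape, so it can misfire on a literal > inside an attribute value or a comment. For messy real-world email, an actual HTML parser would likely be more robust.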
Thanks