Background: I have a large body of text that I regularly load into a single string from an XML document (using LINQ). This string contains a lot of HTML that I need to preserve for output, but the email (mailto) links and other individual HTML links that occasionally occur in it need to be removed. An example of the offending text looks like this:
--<a href="mailto:jsmith@email.com" target="_blank">John Smith</a> from <a href="http://www.agenericwebsite.com" target="_blank">Romanesque Architecture</a></p>
What I need to be able to do is:
- Find the following string:
<a href
- Delete that string and every character after it, up to and including the next occurrence of:
>
- Also, always delete every occurrence of this string:
</a>
Is there a way to do this easily with LINQ, or am I going to have to write an algorithm using .NET string manipulation to achieve this?
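For illustration, here is a minimal sketch of the steps listed above using `System.Text.RegularExpressions` rather than LINQ. The class name `AnchorStripper` and method name `StripAnchors` are hypothetical, chosen only for this example. It assumes that no attribute value inside an anchor tag contains a literal `>`; if that can happen, a real HTML parser would likely be a safer choice than a regular expression.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorStripper
{
    // Matches an opening anchor tag: "<a", one whitespace character,
    // then everything up to and including the first ">".
    // Assumption: attribute values never contain a literal ">".
    private static readonly Regex OpenAnchor =
        new Regex(@"<a\s[^>]*>", RegexOptions.IgnoreCase);

    // Matches the closing anchor tag "</a>", allowing whitespace before ">".
    private static readonly Regex CloseAnchor =
        new Regex(@"</a\s*>", RegexOptions.IgnoreCase);

    // Removes the anchor tags themselves but keeps the text between them
    // and all other HTML in the string.
    public static string StripAnchors(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        string withoutOpeningTags = OpenAnchor.Replace(html, string.Empty);
        return CloseAnchor.Replace(withoutOpeningTags, string.Empty);
    }
}
```

Applied to the example text above, `AnchorStripper.StripAnchors(...)` should leave `--John Smith from Romanesque Architecture</p>`: both anchor tags are gone, while the link text and the closing `</p>` are preserved. Requiring whitespace after `<a` keeps the pattern from matching other tags that merely start with the letter "a", such as `<abbr>`.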