I'm looking for a good function to strip the HTML tags out of a string of HTML. Any ideas?
-
Do you want to remove the HTML or escape it? – DennyRolling Nov 06 '10 at 21:27
-
Trying to remove it. I know this could result in some strange strings, but that's what I need to do with the system I'm integrating with. Thanks. – Paul Fryer Nov 06 '10 at 21:32
-
Similar question: http://stackoverflow.com/questions/787932/using-c-regular-expressions-to-remove-html-tags – eldarerathis Nov 06 '10 at 21:39
-
possible duplicate of [How to extract text from resonably sane HTML?](http://stackoverflow.com/questions/2113651/how-to-extract-text-from-resonably-sane-html) – Wim Coenen Nov 06 '10 at 21:42
2 Answers
6
I have not tested this extensively, but I found it a while back and it has worked for my needs:
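    public static string StripTags(string html) {
        // Lazily match everything from a '<' up to the next '>' (newlines included)
        // and replace each match with an empty string.
        System.Text.RegularExpressions.Regex objRegExp = new System.Text.RegularExpressions.Regex("<(.|\\n)+?>");
        return objRegExp.Replace(html, "");
    }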

Anthony Greco
-
It is worthwhile to note that due to the nature of HTML, it is impossible to write perfectly complete regular expressions to parse HTML. See [here](http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not) and [here](http://stackoverflow.com/questions/133601). – Phil Hunt Nov 06 '10 at 21:37
-
OK, I converted it using [http://www.developerfusion.com/tools/convert/vb-to-csharp/]. It is probably one of the best tools I use daily when googling code samples, because I always find C# examples for things I need in VB.NET. – Anthony Greco Nov 06 '10 at 21:39
-
@Anthony Thanks for the info. Even if it is not C#, I can easily convert from VB to C#. That basically worked for what I'm trying to do. – Paul Fryer Nov 06 '10 at 22:49
-
No problem. Like they said, regular expressions won't always work (nor will any other solution, for that matter), especially because you will often find that programmers forgot to close their tags, etc., but it does work for the majority of situations. Glad I was able to help. – Anthony Greco Nov 06 '10 at 23:00
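For anyone landing here later, the caveats raised in the comments can be illustrated with a slightly more defensive variant. This is only a sketch, not the answer's original code: the class name `HtmlStripper`, the two patterns, and the decision to decode entities with `System.Net.WebUtility.HtmlDecode` (available from .NET 4.0) are all assumptions made for illustration. It is still regex-based, so it inherits the limitation that regular expressions cannot parse arbitrary HTML.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class HtmlStripper
{
    // Comments and whole <script>/<style> elements, contents included.
    private static readonly Regex NonContent = new Regex(
        @"<!--.*?-->|<(script|style)\b[^>]*>.*?</\1\s*>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // A single tag. It must start with a letter after '<', '</' or '<!',
    // so a bare '<' in text such as "1 < 2" is left alone. Quoted attribute
    // values are consumed as a unit, so they may contain '>'.
    private static readonly Regex Tag = new Regex(
        @"<[/!]?[a-zA-Z](?:""[^""]*""|'[^']*'|[^'"">])*>",
        RegexOptions.Singleline);

    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        string text = NonContent.Replace(html, " "); // drop script/style/comments
        text = Tag.Replace(text, " ");               // drop the remaining tags
        text = WebUtility.HtmlDecode(text);          // &amp; -> &, &lt; -> <, ...
        return Regex.Replace(text, @"\s+", " ").Trim(); // collapse whitespace
    }
}
```

Calling `HtmlStripper.StripTags("<a title=\"a > b\">link</a> &amp; more")` should give `link & more`, whereas the simpler pattern would stop at the `>` inside the attribute value. An unclosed tag such as a trailing `<b` is still left in the output, which is exactly the kind of input where a real HTML parser is the safer choice.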