I would like to remove some empty HTML tags of the form <. /> (where . is b, h1, ... but not br, hr, ...).
I thought of using Regex.Replace(myString, "<..? />", ""), but I don't know how to exclude br and hr.
Can anybody help me?
Thanks!
If you know which tags you want to remove, you could do it like this:
Regex.Replace(myString, "<(b|p|div|span) />", "")
Within the parentheses, the alternatives are pipe-delimited.
Try something like this:
< *(?!(?:br|hr)\b)\w+ *\/ *>
Add any tags that you don't want to match to the br|hr part, delimited with '|'. The \b limits the exclusion to exact names, so a tag that merely starts with br or hr is still matched.
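A minimal sketch of how the negative-lookahead approach could be wired up in C#. The class and method names (EmptyTagStripper, Strip) are made up for illustration, and the kept tags are assumed to be just br and hr. The \b after the alternation keeps the exclusion limited to the exact tag names.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class EmptyTagStripper
{
    // Matches a self-closing tag such as <b /> or <h1/>, unless the
    // tag name is exactly br or hr (assumed list; extend with '|').
    private static readonly Regex EmptyTag = new Regex(
        @"<(?!(?:br|hr)\b)\w+\s*/>",
        RegexOptions.IgnoreCase);

    // Returns the input with every matching empty tag removed.
    public static string Strip(string html)
    {
        return EmptyTag.Replace(html, string.Empty);
    }
}
```

For example, EmptyTagStripper.Strip("a<b />c<br />d<h1 />") returns "ac<br />d".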
Use a pattern like this to match and replace them:
<(TAG1|TAG2|TAG3|...)\s*/?>
where (TAG1|TAG2|TAG3|...) is all the tags you want to handle, separated by pipes. Be sure to also specify that the regular expression should be case-insensitive, since HTML tags are case-insensitive. For example, to recognize just the two you listed, you could create a regex like this:
var exp = new Regex(@"<(b|h1)\s*/?>", RegexOptions.IgnoreCase);
How it works:
\s*
recognizes zero or more whitespace characters. (One of these isn't needed at the start of the regex, because the HTML standard doesn't allow whitespace before the tag name.)
/?
optionally matches a '/'. (This is just to be flexible about handling HTML that doesn't use the / in empty tags, since the HTML spec didn't always require it.) Be aware that with the / optional, the pattern also matches an opening tag such as <b>; drop the ? if only self-closing tags should be removed.
You can use it to remove tags like so:
var strippedText = exp.Replace(input, String.Empty);
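Putting the pieces together, here is a small self-contained sketch of the whitelist approach. It assumes only b and h1 should be removed and, unlike the pattern above, requires the slash so that opening tags such as <b> are left alone. The names WhitelistTagStripper and Strip are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class WhitelistTagStripper
{
    // Verbatim string (@"...") so that \s reaches the regex engine unchanged.
    // The '/' is mandatory here: only self-closing forms are matched.
    private static readonly Regex SelfClosing = new Regex(
        @"<(b|h1)\s*/>",
        RegexOptions.IgnoreCase);

    // Returns the input with the whitelisted self-closing tags removed.
    public static string Strip(string input)
    {
        return SelfClosing.Replace(input, String.Empty);
    }
}
```

With this variant, WhitelistTagStripper.Strip("<B/>x<b>y</b><h1 /><br />") returns "x<b>y</b><br />".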