I have some code that removes all HTML tags, but I want to remove every tag except the </td> and </tr> tags.

How can this be done?

public string HtmlStrip(string input)
{
    input = Regex.Replace(input, "<input>(.|\n)*?</input>", "*"); // replace <input></input> elements and anything in between
    input = Regex.Replace(input, @"<xml>(.|\n)*?</xml>", "*"); // replace <xml></xml> elements and anything in between
    return Regex.Replace(input, @"<(.|\n)*?>", "*"); // replace any remaining tags but not their content: "<p>bob<span> johnson</span></p>" becomes "*bob* johnson**"
}
p.campbell

2 Answers

Regex is a poor fit for parsing XML or HTML, because tags can nest and attribute values can contain characters such as ">". Take a look at the HTML Agility Pack instead.

Kevin Anderson
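This removes all HTML tags except td and tr (both opening and closing). The negative lookahead skips those tags, the \b keeps it from also sparing tags that merely start with the same letters (such as <track>), and [^>]* lets a tag span several lines:

input = Regex.Replace(input, @"<(?!/?(?:td|tr)\b)[^>]*>", "");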
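
As a rough sketch of how a lookahead-based pattern of this kind behaves, the following self-contained program strips every tag except td and tr. The class name `TagStripper`, the method name `StripExceptTdTr`, and the sample input are made up for illustration. The pattern assumes a `\b` word boundary and `[^>]*` are wanted, so that tags like `<track>` are not kept by accident and tags that span lines are still matched; `RegexOptions.IgnoreCase` is likewise an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answers.
public static class TagStripper
{
    // Matches any tag whose name is NOT td or tr.
    // (?!/?(?:td|tr)\b) : negative lookahead, fails on <td, </td, <tr, </tr
    // [^>]*             : the rest of the tag, newlines included
    private static readonly Regex NonTdTrTag =
        new Regex(@"<(?!/?(?:td|tr)\b)[^>]*>", RegexOptions.IgnoreCase);

    public static string StripExceptTdTr(string input)
    {
        // Remove the matched tags but leave their text content in place.
        return NonTdTrTag.Replace(input, "");
    }

    public static void Main()
    {
        string html = "<table><tr><td><b>bob</b> <span>johnson</span></td></tr></table>";
        Console.WriteLine(StripExceptTdTr(html));
        // prints: <tr><td>bob johnson</td></tr>
    }
}
```

This still treats HTML as flat text, so comments or attribute values that contain ">" will confuse it; for input you do not control, a real parser is the more robust choice.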

Huy
    Use the code formatting feature for better readability. Explain what is the purpose of the code you mentioned in the context of your question. – Stephane Oct 23 '18 at 09:00