I have an HTML text file and I am trying to remove all HTML tags inside tables, i.e. remove every tag that appears between the <TABLE> and </TABLE> tags.
What's really weird is that the regex I use,
(?<=<table((?!</table).)*)<(?!/table)[^>]+>
works perfectly in PowerGREP and EditPad Pro, but when applied in VB.NET (or Expresso) to the very same text, it does not work!
I just use a simple Regex.Replace call:
newString = Regex.Replace(oldString, "(?<=<table((?!</table).)*)<(?!/table)[^>]+>", String.Empty, RegexOptions.IgnoreCase)
I'm totally confused. Can anyone explain why the same regex behaves differently in these tools, and what I need to change for it to work in .NET? Thanks!
Below is the example text:
================
texttexetext
<TABLE>
<TAG1>
<TAG2>tabletext<TAG3>
<TAG4>
</TABLE>
texttexttext
===============
The final output in PowerGREP is:
================
texttexetext
<TABLE>
tabletext
</TABLE>
texttexttext
===============
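
A minimal console sketch for reproducing the .NET side with the sample text above. It assumes the lines are joined with plain line feeds (the real file may well use CRLF), and the module name Repro and the variable pattern are only illustrative, not part of my actual project:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module Repro
    Sub Main()
        ' Sample text from this post.
        ' Assumption: lines are separated by LF; the real file may use CRLF.
        Dim oldString As String = String.Join(vbLf, {
            "texttexetext",
            "<TABLE>",
            "<TAG1>",
            "<TAG2>tabletext<TAG3>",
            "<TAG4>",
            "</TABLE>",
            "texttexttext"})

        ' The same pattern that is used in PowerGREP / EditPad Pro.
        Dim pattern As String = "(?<=<table((?!</table).)*)<(?!/table)[^>]+>"

        ' Same call as in the question: only IgnoreCase is set.
        Dim newString As String = Regex.Replace(oldString, pattern, String.Empty, RegexOptions.IgnoreCase)

        Console.WriteLine(newString)

        ' Prints True when the replace left the text untouched.
        Console.WriteLine(newString = oldString)
    End Sub
End Module
```

Running this and comparing the printed text with the PowerGREP output above should make it easy to see exactly which tags, if any, the .NET engine removes.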