My HTML is as follows:

 <body><table><tr><td> <h4><span><strong><span>This Text</span></strong></span></h4> </td> <td> <h4><span><strong>That Text<br></strong></span></h4> </td> <td> <h4><span><strong><span>Some Text</span></strong></span></h4> </td> <td><span><strong>0 505 253 56 13</strong></span></td></tr><table></body>

The following regex strips all tags except a, p, and img:

 _QsHtml = Regex.Replace(_QsHtml, @"<\/?(?!a)(?!p)(?!img)\w*\b[^>]*>","", RegexOptions.Multiline);

I would like to apply the regex only to the content of table cells (td). I tried the following regex, but it does not match:

<\/?td(?!a)(?!p)(?!img)\w*\b[^>]*td>
Kerberos

1 Answer

OK, I solved the problem by combining the regex with Html Agility Pack. The full working code is below:

using System.Linq;                     // needed for ToList()
using System.Text.RegularExpressions;  // needed for Regex

string _QsHtml = @"<body><table><tr><td> <h4><span><strong><span>This Text</span></strong></span></h4> </td> <td> <h4><span><strong>That Text<br></strong></span></h4> </td> <td> <h4><span><strong><span>Some Text</span></strong></span></h4> </td> <td><span><strong>0 505 253 56 13</strong></span></td></tr><table></body>";

var _HtmlDocument = new HtmlAgilityPack.HtmlDocument();
_HtmlDocument.OptionFixNestedTags = true;
_HtmlDocument.OptionAutoCloseOnEnd = true;
_HtmlDocument.OptionWriteEmptyNodes = true;
_HtmlDocument.LoadHtml(_QsHtml);

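// Select every table cell (<td>); SelectNodes returns null when nothing matches.
var FindTableRows = _HtmlDocument.DocumentNode.SelectNodes("//td");

if (FindTableRows != null)
{
    // Iterate over a snapshot of the node list while the cells are rewritten.
    foreach (var TableRow in FindTableRows.ToList())
    {
        // Strip every tag inside the cell except those starting with a, br, or img.
        string _InnerHtml = Regex.Replace(TableRow.InnerHtml,
                                          @"<\/?(?!a)(?!br)(?!img)\w*\b[^>]*>",
                                          "", RegexOptions.Multiline);
        TableRow.InnerHtml = _InnerHtml;
    }
}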
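
One caveat about the pattern above: `(?!a)` only looks at the next character, so it also spares any tag whose name merely starts with `a` (for example `<abbr>` or `<address>`), and `(?!br)` and `(?!img)` behave the same way for their prefixes. Below is a minimal sketch of a stricter variant, assuming the goal is to keep exactly `a`, `br`, and `img`. The `TagStripper` class, the `StripTags` helper, and the sample strings are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Text.RegularExpressions;

static class TagStripper
{
    // Removes every tag except <a>, <br> and <img>. The \b after the
    // alternation requires the tag name to end there, so "abbr" is not
    // mistaken for "a".
    private static readonly Regex Strip =
        new Regex(@"</?(?!(?:a|br|img)\b)\w+\b[^>]*>", RegexOptions.IgnoreCase);

    public static string StripTags(string html) => Strip.Replace(html, "");

    static void Main()
    {
        // Inner HTML of one cell from the question: only <br> survives.
        string cell = " <h4><span><strong>That Text<br></strong></span></h4> ";
        Console.WriteLine(StripTags(cell));

        // <abbr> is removed, while the <a> element is kept intact.
        Console.WriteLine(StripTags("<abbr>HTML</abbr> <a href=\"#\">link</a>"));
    }
}
```

Inside the loop above, this pattern could replace the one passed to `Regex.Replace`; the Html Agility Pack part stays the same.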
Kerberos