I have the following regex that searches for the tags h1, h2, ..., h5 and returns a match with a group named `TagName` holding the tag name and a group named `TagValue` holding the tag's content.

Public Sub Main()
    Dim strSearched = <html>
                          <head>
                              <title>This is a test</title>
                          </head>
                          <body>
                              <h1>DA:TG01</h1>
                              <p>First paragraph</p>
                              <h2>This is a test 2</h2>
                              <!--More boring stuff omitted-->
                          </body>
                      </html>.ToString

    Dim ResultString As String
    Dim myMatchEvaluator As MatchEvaluator = New MatchEvaluator(AddressOf ComputeReplacement)

    ResultString = Regex.Replace(strSearched,
                                 "<(?'TagName'h[1-5])>(?'TagValue'.*?)</\k<TagName>>",
                                 myMatchEvaluator,
                                 RegexOptions.Singleline Or RegexOptions.IgnoreCase)


End Sub

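Public Function ComputeReplacement(ByVal m As Match) As String
    ' Need to replace Group("TagValue") here and return the rebuilt match.
    ' Placeholder: for now the match is returned unchanged.
    Dim strRetValue As String = m.Value

    Return strRetValue
End Function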

In the function `ComputeReplacement`, I need to replace the value of `Group("TagValue")` with another value and return the whole match string with only that part changed. For example:

If the match was `<h1>AAA</h1>`, I would need it to return `<h1>BBB</h1>`; if the match was `<h2>AAA</h2>`, I would need it to return `<h2>BBB</h2>`.

JPScerri
  • [you should consider a DOM parser instead of regexes](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Martin Ender Sep 26 '12 at 13:51
  • If you fixed this yourself could you add an answer explaining what you did? – Kev Sep 27 '12 at 00:17
  • No, unfortunately I did not fix it. For the above example I am just rebuilding the string manually. But I have more complex regexes, in which case I would have to write a separate `ComputeReplacement` function for each. – JPScerri Sep 27 '12 at 06:03

1 Answer

You should probably convert the HTML to XML and query it with XPath rather than use a regex. You could use one of these libraries:

  • HtmlAgilityPack: http://htmlagilitypack.codeplex.com
  • SGMLReader: http://developer.mindtouch.com/SgmlReader
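
If the regex approach has to stay, one possible way to avoid writing a separate evaluator per pattern is to splice the new value into the match using the group's position. The following is only a minimal sketch, not a tested solution for arbitrary HTML: `ReplaceGroup` is a hypothetical helper name, `"BBB"` is a placeholder value, and the input string is a shortened stand-in for the page in the question. It relies on `Group.Index` and `Match.Index` both being offsets into the original input.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupReplaceSketch

    ' Hypothetical helper (not from the question): returns the match text with
    ' the named group's captured span swapped for newValue.
    Function ReplaceGroup(ByVal m As Match, ByVal groupName As String, ByVal newValue As String) As String
        Dim g As Group = m.Groups(groupName)

        ' If the group did not take part in the match, return the match unchanged.
        If Not g.Success Then Return m.Value

        ' Group.Index is relative to the whole input, so make it relative to the match.
        Dim offset As Integer = g.Index - m.Index

        Return m.Value.Substring(0, offset) & newValue & m.Value.Substring(offset + g.Length)
    End Function

    Function ComputeReplacement(ByVal m As Match) As String
        ' "BBB" is a placeholder; compute the real replacement here.
        Return ReplaceGroup(m, "TagValue", "BBB")
    End Function

    Sub Main()
        Dim html As String = "<h1>AAA</h1><p>First paragraph</p><h2>AAA</h2>"

        Dim result As String = Regex.Replace(html,
                                             "<(?'TagName'h[1-5])>(?'TagValue'.*?)</\k<TagName>>",
                                             AddressOf ComputeReplacement,
                                             RegexOptions.Singleline Or RegexOptions.IgnoreCase)

        Console.WriteLine(result) ' <h1>BBB</h1><p>First paragraph</p><h2>BBB</h2>
    End Sub

End Module
```

Because the helper only needs a `Match` and a group name, the same function could be reused with other patterns. If a group captures more than once within a single match, `Group.Index` and `Group.Length` describe only the last capture, so this sketch would replace just that one.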

Yshayy