
I have this HTML code:

<div class="keyboard">
  <p>
    Hello world!
  </p>
</div>

I want to extract the text "Hello world!". I tried the regex code below, but it didn't work.

Dim findtext2 As String = "(?<=<div class=""keyboard"">)(.*?)(?=</div>)"
Dim myregex2 As String = TextBox1.Text 'HTML code above
Dim doregex2 As MatchCollection = Regex.Matches(myregex2, findtext2)
Dim matches2 As String = ""
For Each match2 As Match In doregex2
    matches2 = matches2 + match2.ToString + Environment.NewLine
Next
MsgBox(matches2)
Cybernux

1 Answer


As was mentioned in the comments, don't use regex to parse HTML.
Instead, use LINQ to XML:

Dim html As XElement =
    <html>
        <body>
            <div class = "keyboard">
                <p>Hello world!</p>
            </div>
        </body>
    </html>

Dim values As IEnumerable(Of String) =
    html.Descendants("div").
         Where(Function(div) div.Attribute("class").Value.Equals("keyboard")).
         Select(Function(div) div.Element("p").Value)

' Print the text of each matching <p> element.
For Each value As String In values
    Console.WriteLine(value)
Next
Fabio
  • It gives an error on the first line: **Value of type 'String' cannot be converted to 'System.Xml.Linq.XElement'**. – Cybernux Sep 29 '16 at 02:54
  • Did you wrap the HTML code in quotes? If so, remove the quotes. [XML Literals Overview (Visual Basic)](https://msdn.microsoft.com/en-us/library/bb384629.aspx) – Fabio Sep 29 '16 at 03:36
  • I put it in a TextBox and wrote **Dim html As XElement = TextBox1.Text**. – Cybernux Sep 29 '16 at 20:40
  • Then use `Dim html As XElement = XElement.Parse(TextBox1.Text)` – Fabio Sep 30 '16 at 03:07
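
Putting the comments together, here is a minimal sketch of the `XElement.Parse` route as a console program. The `source` string is only a stand-in for `TextBox1.Text`, and the `Program` module is there just to make the sketch self-contained. Two things differ from the answer's snippet, both assumptions about the input: when the question's markup is parsed directly, the `<div>` is the root element, so `DescendantsAndSelf` is used (`Descendants` skips the root itself), and `Trim()` strips the line breaks and indentation around the text. `XElement.Parse` throws an `XmlException` if the text is not well-formed XML, which real-world HTML often is not.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module Program
    Sub Main()
        ' Stand-in for TextBox1.Text: the markup from the question as a plain string.
        Dim source As String =
            "<div class=""keyboard"">" & Environment.NewLine &
            "  <p>" & Environment.NewLine &
            "    Hello world!" & Environment.NewLine &
            "  </p>" & Environment.NewLine &
            "</div>"

        ' Throws XmlException if the text is not well-formed XML.
        Dim root As XElement = XElement.Parse(source)

        ' DescendantsAndSelf includes the root, which is the <div> itself here.
        ' The IsNot Nothing check skips <div> elements that have no class attribute.
        Dim values As IEnumerable(Of String) =
            root.DescendantsAndSelf("div").
                 Where(Function(d) d.Attribute("class") IsNot Nothing AndAlso
                                   d.Attribute("class").Value = "keyboard").
                 Elements("p").
                 Select(Function(p) p.Value.Trim())

        For Each value As String In values
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

With the markup from the question, this should print `Hello world!` on a single line.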