
So, this is my code:

    ' Requires: Imports System.Text.RegularExpressions
    ' Download the page source and show it in TextBox2.
    Dim sourceString As String = New System.Net.WebClient().DownloadString("www.example.com")
    TextBox2.Text = sourceString
    ' Capture everything between <div class="books"> and </div>.
    Dim findtext2 As String = "(?<=<div class=""books"">)(.*?)(?=</div>)"
    Dim myregex2 As String = TextBox2.Text
    Dim doregex2 As MatchCollection = Regex.Matches(myregex2, findtext2)
    ' Join all matches, one per line.
    Dim matches2 As String = ""
    For Each match2 As Match In doregex2
        matches2 = matches2 + match2.ToString + Environment.NewLine
    Next
    MsgBox(matches2)

It's getting all values between <div class="books"> and </div>, but there is one big problem.

On the real page, "books" is followed by 3 extra characters that differ from one div to the next (like <div class="books672">).

On example.com, the HTML is like this:

    <div class="books321">Book1</div>
    <div class="books785">Book2</div>
    <div class="books547">Book3</div>
    <div class="books182">Book4</div>
    <div class="books317">Book5</div>
    <div class="books970">Book6</div>

How could I get "Book1, Book2..."? Does regex have something that matches arbitrary characters?

Stefan Đorđević

1 Answer


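In a regex, \w matches a single word character (a letter, a digit, or an underscore), and a quantifier such as {3} repeats it exactly that many times, so \w{1} matches one such character. In this case, I needed exactly 3 of them, so the solution is: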

    (?<=<div class="books\w{3}">)(.*?)(?=</div>)
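
To see the pattern in action without downloading anything, here is a minimal sketch as a console program. The module name `BooksDemo` and the variables `html` and `pattern` are made up for illustration, and the markup is just the first three sample lines from the question hard-coded into a string. Inside a VB.NET string literal the double quotes have to be doubled.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BooksDemo
    Sub Main()
        ' Sample markup from the question (assumption: a real page would be downloaded instead).
        Dim html As String =
            "<div class=""books321"">Book1</div>" & Environment.NewLine &
            "<div class=""books785"">Book2</div>" & Environment.NewLine &
            "<div class=""books547"">Book3</div>"

        ' \w{3} stands for exactly three word characters after "books".
        Dim pattern As String = "(?<=<div class=""books\w{3}"">)(.*?)(?=</div>)"

        ' Print the text found inside each matching div, one per line.
        For Each m As Match In Regex.Matches(html, pattern)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

With the sample above this should print Book1, Book2 and Book3 on separate lines. Since the three characters in the sample are always digits, \d{3} would be a stricter alternative to \w{3}, if that also holds for the real page.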
Stefan Đorđević