So, this is my code:
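' Requires at the top of the file: Imports System.Text.RegularExpressions

' Download the page source and show it in the text box.
' DownloadString needs an absolute URL, including the scheme.
Dim sourceString As String = New System.Net.WebClient().DownloadString("http://www.example.com")
TextBox2.Text = sourceString

' Lookbehind and lookahead, so only the text between the tags is matched.
Dim findtext2 As String = "(?<=<div class=""books"">)(.*?)(?=</div>)"
Dim myregex2 As String = TextBox2.Text
Dim doregex2 As MatchCollection = Regex.Matches(myregex2, findtext2)

' Collect every match on its own line.
Dim matches2 As String = ""
For Each match2 As Match In doregex2
    matches2 &= match2.Value & Environment.NewLine
Next
MsgBox(matches2)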
It gets every value between <div class="books"> and </div>, but there is one big problem: after "books" there are 3 more characters that differ from element to element (for example <div class="books672">).
On example.com, the HTML is like this:
<div class="books321">Book1</div>
<div class="books785">Book2</div>
<div class="books547">Book3</div>
<div class="books182">Book4</div>
<div class="books317">Book5</div>
<div class="books970">Book6</div>
How can I get "Book1, Book2, ..."? Is there something in regex that matches arbitrary characters?
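
One direction that might work, assuming the three extra characters are always digits as in the sample HTML above: replace the literal class name with books\d{3} and read the text from a capture group instead of using lookarounds. This is only a sketch; the names BookExtractor and ExtractBooks are made up for illustration, and the sample string stands in for the downloaded page.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module BookExtractor
    ' Hypothetical helper: returns the inner text of every <div class="booksNNN">.
    ' Assumption: the suffix is exactly three digits (\d{3}).
    ' If it can contain letters too, \w{3} or .{3} could be used instead.
    Function ExtractBooks(html As String) As List(Of String)
        Dim pattern As String = "<div class=""books\d{3}"">(.*?)</div>"
        Dim result As New List(Of String)
        For Each m As Match In Regex.Matches(html, pattern)
            ' Group 1 is the lazy (.*?) between the opening and closing tag.
            result.Add(m.Groups(1).Value)
        Next
        Return result
    End Function

    Sub Main()
        ' Two lines taken from the sample HTML in the question.
        Dim html As String = "<div class=""books321"">Book1</div>" & Environment.NewLine &
                             "<div class=""books785"">Book2</div>"
        ' Prints: Book1, Book2
        Console.WriteLine(String.Join(", ", ExtractBooks(html)))
    End Sub
End Module
```

If the suffix length is not fixed, a quantifier such as \d+ or \w* in place of \d{3} would probably be the thing to try.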