The regex below is one of many that you could use. It uses zero-width positive look-behind, (?<=...), and look-ahead, (?=...), assertions to locate the target string.
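' Requires: Imports System.Text.RegularExpressions
Dim str As String = _
    "<p>Rubriek:" & vbCrLf &
    " <a href=""http://www.detelefoongids.nl/juwelier/4-1/?oWhat=Juwelier""" & vbCrLf &
    " title = ""Juwelier""" & vbCrLf &
    " class=""category"">" & vbCrLf &
    " Juwelier" & vbCrLf &
    " </a>" & vbCrLf &
    "</p>"

' Look-behind: everything from <p>Rubriek: up to class="category">, plus leading whitespace.
' \w+ is the text we want; the look-ahead requires the closing </a> after it.
Dim match As Match = Regex.Match(str, _
    "(?<=<p>Rubriek:[^>]+?class=""category"">\W*)\w+(?=\W*</a>)")

If match.Success Then
    MsgBox(match.Value) ' Shows "Juwelier"
End If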
Although it is not used above, an important thing to remember when matching across multiple lines is to enable Single-line mode if you use the wildcard metacharacter . in your pattern. By default . matches every character except the line feed (\n); in Single-line mode it matches every character, including newlines. The mode can be specified with RegexOptions.Singleline or by putting (?s) at the start of the regex.
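As a minimal sketch of the Single-line point, the module below matches the same kind of markup with .*? instead of [^>] and \W. The module name SinglelineDemo and the helper ExtractCategory are hypothetical names made up for this illustration; they are not part of the original answer.

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module SinglelineDemo
    ' Hypothetical helper: returns the trimmed text of the first element whose
    ' opening tag ends with class="category">, or Nothing if there is none.
    Function ExtractCategory(html As String) As String
        ' .*? can cross line breaks here only because Singleline is set.
        Dim m As Match = Regex.Match(html, _
            "class=""category"">(.*?)</a>", RegexOptions.Singleline)
        Return If(m.Success, m.Groups(1).Value.Trim(), Nothing)
    End Function

    Sub Main()
        Dim html As String = _
            "<a class=""category"">" & vbCrLf & " Juwelier" & vbCrLf & "</a>"

        ' Without Singleline, . stops at the line feed, so nothing matches.
        Console.WriteLine(Regex.IsMatch(html, "class=""category"">(.*?)</a>"))

        ' With Singleline the match succeeds and the captured text is trimmed.
        Console.WriteLine(ExtractCategory(html))
    End Sub
End Module
```

Writing the pattern as "(?s)class=""category"">(.*?)</a>" and dropping the RegexOptions argument should behave the same way, since (?s) is the inline form of the same option.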
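\w+ is used to match one or more word characters: letters, digits and the underscore. In .NET this includes Unicode letters and digits; it is limited to a-zA-Z0-9_ only when RegexOptions.ECMAScript is set.
\W* is used to match zero or more non-word characters.
[^>] is used to match any single character that is not >.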
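To see these character classes in isolation, here is a small sketch. WordClassDemo and FirstWord are hypothetical names introduced only for this example. It relies on the fact that in .NET \w also matches non-ASCII letters unless RegexOptions.ECMAScript is set.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordClassDemo
    ' Hypothetical helper: returns the first run of word characters in text.
    Function FirstWord(text As String, options As RegexOptions) As String
        Return Regex.Match(text, "\w+", options).Value
    End Function

    Sub Main()
        ' Default .NET behaviour: \w is Unicode-aware, so this returns "Café".
        Console.WriteLine(FirstWord("Café!", RegexOptions.None))

        ' ECMAScript mode restricts \w to a-zA-Z0-9_, so this returns "Caf".
        Console.WriteLine(FirstWord("Café!", RegexOptions.ECMAScript))

        ' [^>]+ consumes everything up to, but not including, the next >.
        Console.WriteLine(Regex.Match("<a href=""x"">", "[^>]+").Value)
    End Sub
End Module
```

This is why [^>]+? works in the look-behind above: it can run across the attributes and line breaks inside the opening <a ...> tag, but it can never step past the tag's closing >.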