I need to sort through an HTML string to get the content I need. Now I need to loop through the rows of a table that has an ID. How do I do this with a regex?

Brian Tompsett - 汤莱恩

Dejan.S
See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Manu Jan 18 '10 at 10:04
4 Answers
1
Regular expressions cannot be used to parse HTML; HTML is not regular. Use a proper HTML parser library.
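As a rough illustration of the parser route, here is a minimal sketch in Python using only the standard library's html.parser module. It assumes the ID sits on the table itself; the class name TableRowCollector, the helper rows_of_table, and the sample markup (including the id "fruits") are all made up for this example and are not from the question.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the cell text of every row in the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # 0 = outside the target table, 1 = directly inside it
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1          # nested table inside the target
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1           # entering the target table
        elif self.depth == 1:
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1
        elif self.depth == 1:
            if tag in ("td", "th") and self._cell is not None:
                self._row.append("".join(self._cell).strip())
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self.rows.append(self._row)
                self._row = None

    def handle_data(self, data):
        # Text is kept only while a cell of the target table is open.
        if self._cell is not None:
            self._cell.append(data)


def rows_of_table(html, table_id):
    """Hypothetical helper: return the rows of the table whose id matches."""
    parser = TableRowCollector(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample input; attributes on the tags do not matter here.
sample = """
<table id="other"><tr><td>skip</td></tr></table>
<table id="fruits" class="grid">
  <tr class="odd"><td>1</td><td>Apple</td></tr>
  <tr><td>2</td><td>Ball</td></tr>
</table>
"""

for row in rows_of_table(sample, "fruits"):
    print(row)
```

Because the parser sees tags and attributes rather than raw characters, extra attributes, whitespace, or line breaks in the markup do not change the result. Most languages have a comparable HTML parsing library.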

Ignacio Vazquez-Abrams
1
It depends on how regular the HTML text is. For example, given this table:
<table>
<tr><td>1</td><td>Apple</td></tr>
<tr><td>2</td><td>Ball</td></tr>
<tr><td>3</td><td>Cookie</td></tr>
</table>
The following regular expression finds the IDs in the first column:
(?<=<tr><td>).*?(?=</td>)
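To see what this pattern does and where it stops working, here is a small Python sketch using the standard re module. The first input is the table from this answer; the second input, with a class attribute on the row, is a made-up variation added only to show how sensitive the pattern is to the exact markup.

```python
import re

# The pattern from the answer: text between "<tr><td>" and the next "</td>".
ID_PATTERN = re.compile(r"(?<=<tr><td>).*?(?=</td>)")

regular_html = """<table>
<tr><td>1</td><td>Apple</td></tr>
<tr><td>2</td><td>Ball</td></tr>
<tr><td>3</td><td>Cookie</td></tr>
</table>"""

ids = ID_PATTERN.findall(regular_html)
print(ids)  # ['1', '2', '3']

# Hypothetical variation: an attribute on <tr> breaks the literal "<tr><td>".
irregular_html = '<tr class="odd"><td>4</td><td>Donut</td></tr>'
missed = ID_PATTERN.findall(irregular_html)
print(missed)  # []
```

So the approach holds only as long as every row is written exactly as `<tr><td>` with no attributes or whitespace in between.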

Mike Hanson
0
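Try this:
' Requires: Imports System.Text and Imports System.Text.RegularExpressions
Dim html As String = contentText
' Singleline lets "." match newlines, so a table may span several lines
Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
' Lazy (.*?) stops at the first </table>, so each table is matched separately
Dim tableRegex As New Regex("<table[^>]*>(.*?)</table>", options)
Dim matches As MatchCollection = tableRegex.Matches(html)
Dim sb As New StringBuilder()
For Each item As Match In matches
    sb.Append(item.Value & vbLf)
Next
TextBox.Text = sb.ToString()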
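With a pattern of this shape, whether the quantifier is greedy (`.*`) or lazy (`.*?`) matters as soon as the input contains more than one table. The following Python sketch, using the standard re module with flags equivalent to IgnoreCase and Singleline, shows the difference; the two-table input is invented for the demonstration.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL  # counterparts of IgnoreCase and Singleline

# Hypothetical input with two separate tables.
two_tables = (
    '<table id="a"><tr><td>1</td></tr></table>'
    "<p>x</p>"
    '<table id="b"><tr><td>2</td></tr></table>'
)

# Greedy: ".*" runs to the LAST </table>, so both tables end up in one match.
greedy = re.findall(r"<table[^>]*>(.*)</table>", two_tables, FLAGS)

# Lazy: ".*?" stops at the FIRST </table>, giving one match per table.
lazy = re.findall(r"<table[^>]*>(.*?)</table>", two_tables, FLAGS)

print(len(greedy))  # 1
print(lazy)         # ['<tr><td>1</td></tr>', '<tr><td>2</td></tr>']
```

Neither form copes with nested tables, which is one reason the other answers recommend a parser.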
0
If you run the page through an HTML parser like BeautifulSoup, you can prettify the markup so that this kind of regex has a chance. But if you are parsing the HTML anyway...

Charles Stewart