I have a problem extracting a string from HTML (it's basically a problem with my regular expression). Here's the code:
string wheretosearch = @"
<td class=""name"">
<div>
<a href=""/addr1.html"" class=""link "">
<span>Title1</span>
</a></td>
[some code]
<td class=""name"">
<div>
<a href=""/addr2.html"" class=""link "">
<span>Title2</span>
</a></td>";
I want to extract the titles between the <span> tags. My problem is that I can't express "an unknown number of characters" in the regex (the .* part after td class=""name""):
<td class=""name"">.*<span>(?<title>.*)</span>
To put it simply: I want the regex to find <td class=""name"">, then, after an unknown number of characters, find the first occurrence of <span>, and then take the value between that first <span> and </span>.
What it actually does is take the last occurrence of <span>, so it gives me only the last title.
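For reference, here is a minimal sketch that reproduces the behaviour described above. The helper name Titles is made up for illustration, and RegexOptions.Singleline is an assumption on my part: without it, . does not match the newlines in the HTML. The second pattern is the same expression with lazy quantifiers (.*?), included only for comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Top-level statements (C# 9 or later); wrap in a Main method on older compilers.
string html = @"
<td class=""name"">
<div>
<a href=""/addr1.html"" class=""link "">
<span>Title1</span>
</a></td>
[some code]
<td class=""name"">
<div>
<a href=""/addr2.html"" class=""link "">
<span>Title2</span>
</a></td>";

// Hypothetical helper: collects the "title" group of every match.
// Assumption: RegexOptions.Singleline, so '.' also matches newlines.
List<string> Titles(string input, string pattern)
{
    var titles = new List<string>();
    foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Singleline))
        titles.Add(m.Groups["title"].Value);
    return titles;
}

// Greedy quantifiers, as in my pattern: a single match spanning both cells.
var greedy = Titles(html, @"<td class=""name"">.*<span>(?<title>.*)</span>");

// Lazy quantifiers: each .*? stops at the first place the rest can match.
var lazy = Titles(html, @"<td class=""name"">.*?<span>(?<title>.*?)</span>");

Console.WriteLine(string.Join(", ", greedy)); // Title2
Console.WriteLine(string.Join(", ", lazy));   // Title1, Title2
```

The greedy version produces one match that runs from the first <td class=""name""> to the last </span>, which is why only the last title comes out.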
EDIT:
Okay, HTML aside, the problem boils down to this. I've got the string:
"This is a text: NICE. This is a great text: NICE TOO."
I would like to match "This", then an unknown number of characters, and then capture the string between ": " and ".". How could this be done?
Of course I'm interested in every occurrence of that pattern, so the output would be a collection containing "NICE" and "NICE TOO".
With an expression like "This.*(?<title>.*)." I get only the "NICE TOO" string; as @urlreader mentioned, it finds the longest possible match.
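Here is a small sketch of the simplified case. The patterns are my assumption of what the expression should look like once the literal ": " is included and the final period is escaped as \. (unescaped, . matches any character). The helper name Titles is hypothetical. The first pattern uses greedy quantifiers, the second uses lazy ones (.*?) for comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Top-level statements (C# 9 or later); wrap in a Main method on older compilers.
string text = "This is a text: NICE. This is a great text: NICE TOO.";

// Hypothetical helper: collects the "title" group of every match.
List<string> Titles(string input, string pattern)
{
    var titles = new List<string>();
    foreach (Match m in Regex.Matches(input, pattern))
        titles.Add(m.Groups["title"].Value);
    return titles;
}

// Greedy: the first .* runs to the end and backtracks to the LAST ": ",
// so the whole sentence pair is consumed by one match.
var greedy = Titles(text, @"This.*: (?<title>.*)\.");

// Lazy: .*? grows one character at a time and stops at the FIRST ": ",
// and the title stops at the first period, so each sentence matches on its own.
var lazy = Titles(text, @"This.*?: (?<title>.*?)\.");

Console.WriteLine(string.Join(" | ", greedy)); // NICE TOO
Console.WriteLine(string.Join(" | ", lazy));   // NICE | NICE TOO
```

The lazy variant gives both values in the collection for this input, but I'm not sure it is the right approach in general, for example when a title itself contains a period.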