I would like to extract URLs from an HTML page using a regex (not all URLs, only the ones that match my pattern).

I tried Regex.Match:

string html = request.Get(
    "http://www.bing.com/search?q=" + keyword + "&first=1"
).ToString();
Match urls = Regex.Match(html, "<h2><a href=\"(.*?)\"");

It only returns one URL, but I would like to get all of the URLs.

EDIT: for anyone who runs into the same problem, here is the solution:

string pattern = @"<a href=""([^""]+)";
Regex rgx = new Regex(pattern);

// Matches returns every match; Groups[1] holds the captured URL.
foreach (Match match in rgx.Matches(html))
    Console.WriteLine("Found '{0}' at position {1}", match.Groups[1].Value, match.Index);
mr spysix

1 Answer

To match every link, and not only the ones inside an <h2> heading, you need to remove the <h2> tag from the pattern.

Try pattern: <a href="([^"]+)

Explanation:

<a href=" - match literally <a href="

([^"]+) - match one or more characters other than " and store them in the first capturing group

Regex.Match returns only the first match. To get all URLs, call the Matches method instead and loop over the results, reading each URL from the Groups property:

foreach (Match match in Regex.Matches(html, "<a href=\"([^\"]+)"))
{
  // get url from first capturing group
  string url = match.Groups[1].Value;
  // ...
}
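
For reference, the loop can be wrapped into a small self-contained program. This is only a minimal sketch: the LinkExtractor class, the ExtractHrefs helper and the sample markup are made up for illustration and stand in for the page downloaded in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Hypothetical helper: collects the href value of every <a href="..."> in the input.
    public static List<string> ExtractHrefs(string html)
    {
        var urls = new List<string>();

        // Matches (plural) returns all matches, unlike Match, which stops at the first one.
        foreach (Match match in Regex.Matches(html, "<a href=\"([^\"]+)"))
        {
            // Group 0 is the whole match; group 1 is the URL captured by ([^"]+).
            urls.Add(match.Groups[1].Value);
        }

        return urls;
    }

    public static void Main()
    {
        // Made-up sample markup, assumed here in place of the real response body.
        string html = "<h2><a href=\"http://example.com/a\">A</a></h2>" +
                      "<p><a href=\"http://example.com/b\">B</a></p>";

        foreach (string url in ExtractHrefs(html))
            Console.WriteLine(url);
    }
}
```

Note that this pattern only finds anchors where href is the first attribute and is wrapped in double quotes, so links written as <a class="x" href="..."> or with single quotes will be skipped.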
Michał Turczyn