
How can I change this:

Regex.Match(value, @"href=\""(.*?)\""", RegexOptions.Singleline);

so that it matches href='foobar' (single quotes) as well as href="foobar" (double quotes)?

Sam Hoole
  • Check [this question](http://stackoverflow.com/questions/30659022/regex-for-extract-url-from-string-fails-when-string-contains-multiple-double-quo). The regex in the question should work for you. – Wiktor Stribiżew Apr 18 '16 at 20:06
  • Do you specifically only want to select href='foobar' ? – sachin k Apr 18 '16 at 20:10
  • If you want to parse out a href links from HTML, see a [snippet here](http://stackoverflow.com/questions/30629793/c-sharp-regular-expression-for-finding-links-in-a-with-specific-ending) showing how you can do that with HtmlAgilityPack. – Wiktor Stribiżew Apr 18 '16 at 20:16
  • I tried installing the HtmlAgilityPack earlier in the day Wiktor via NuGet, however when installing it just said "Cannot install, Corrupted data" or something similar – Sam Hoole Apr 18 '16 at 20:49
  • I have not had any trouble. You can retry later. If you want I can post my answer describing how to use HtmlAgilityPack to get the `href`s. – Wiktor Stribiżew Apr 18 '16 at 20:59
  • It is okay, thank you very much for your help though Wiktor ^_^ I will retry, but I will probably not have need for htmlagility after this – Sam Hoole Apr 18 '16 at 21:24

1 Answer


You can use a pattern like this:

href=(["'])(.*?)\1

This will match any string that contains href= followed by a " or ', then any number of characters (matched non-greedily), then the same quote character that was captured in group 1. Here \1 is a backreference to group 1, so an opening " can only be closed by " and an opening ' only by '.

Note that the contents of the attribute are now captured in group 2 rather than group 1.

To write this pattern as a C# string literal, escape it either like this (using a regular string):

Regex.Match(value, "href=([\"'])(.*?)\\1", RegexOptions.Singleline);

Or like this (using verbatim strings):

Regex.Match(value, @"href=([""'])(.*?)\1", RegexOptions.Singleline);
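
For completeness, here is a minimal sketch of how the pattern might be used end to end. The class name HrefExtractor and the helper ExtractHref are hypothetical names chosen for illustration; only the pattern itself and the use of group 2 come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class HrefExtractor
{
    // Group 1 is the opening quote, group 2 is the attribute value,
    // and \1 requires the closing quote to match the opening one.
    private static readonly Regex HrefPattern =
        new Regex(@"href=([""'])(.*?)\1", RegexOptions.Singleline);

    // Returns the href value of the first match, or null if nothing matches.
    public static string ExtractHref(string value)
    {
        Match match = HrefPattern.Match(value);
        return match.Success ? match.Groups[2].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractHref("<a href=\"foobar\">x</a>")); // double quotes
        Console.WriteLine(ExtractHref("<a href='foobar'>x</a>"));   // single quotes
        // The other quote character inside the value does not end the match.
        Console.WriteLine(ExtractHref("<a href=\"it's\">x</a>"));
    }
}
```

Because of the backreference, a value opened with one quote character and closed with the other (for example href="foo') does not match at all, so ExtractHref returns null in that case.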
p.s.w.g