0

I have a line of code:

<link href="<%= Page.ResolveClientUrl("~/Styles/CAR.css") %>" rel="stylesheet" type="text/css" />

I just want to extract ~/Styles/CAR.css from it. What regex would do this? The link href may also use other syntax to reference the CSS file, for example: <link href="<%= Url.Content("~/Styles/CAR.css") %>" rel="stylesheet" type="text/css" />

Vijay
Kabir
  • I was under the impression that bad things happen when you try to parse html with regex... http://stackoverflow.com/a/1732454/24908 =) – Aaron Palmer Jan 27 '14 at 13:24
  • `Page.ResolveClientUrl` and `Url.Content` don't necessarily result in the same output, so how would this be reliable? Why don't you parse the generated HTML instead of server-side code? – nmclean Jan 27 '14 at 13:33

4 Answers

2

I suggest using HtmlAgilityPack (available from NuGet) for HTML parsing. Getting the href attribute value looks like this:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(stringWithHtml);
var link = doc.DocumentNode.SelectSingleNode("//link[@href]");
var href = link.Attributes["href"].Value;

Then you can extract ~/Styles/CAR.css from the attribute's content. A regex works here, but you can also avoid it:

int startIndex = href.IndexOf('"');
int endIndex = href.LastIndexOf('"');
var result = href.Substring(startIndex + 1, endIndex - startIndex - 1);
// ~/Styles/CAR.css
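
A caveat with the index-based approach: if the attribute value contains fewer than two double quotes, `Substring` is called with a negative length and throws. A minimal guarded sketch follows; the class and method names (`QuotedSegment`, `Between`) are hypothetical and not part of the original answer, and returning an empty string on failure is an assumption.

```csharp
public static class QuotedSegment
{
    // Returns the text between the first and last double quote in href,
    // or an empty string when href contains fewer than two double quotes.
    public static string Between(string href)
    {
        int startIndex = href.IndexOf('"');
        int endIndex = href.LastIndexOf('"');

        // IndexOf returns -1 when no quote exists; equal indexes mean one quote only.
        if (startIndex < 0 || endIndex <= startIndex)
        {
            return string.Empty;
        }

        return href.Substring(startIndex + 1, endIndex - startIndex - 1);
    }
}
```

Calling `QuotedSegment.Between(href)` with the attribute value from the question yields the path, while an input without quotes yields an empty string instead of an exception.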

Extracting the path with a regex looks like this:

var match = Regex.Match(href, @"ResolveClientUrl\(""(.*)""\)");
if (match.Success)
    result = match.Groups[1].Value;
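
Note that the pattern above only recognizes `ResolveClientUrl`, while the question also mentions `Url.Content`. One possible sketch that accepts either helper name is shown below. The names `CssPathExtractor` and `TryExtractPath` are hypothetical, and the pattern assumes the path itself contains no double quotes.

```csharp
using System.Text.RegularExpressions;

public static class CssPathExtractor
{
    // Matches either helper name followed by ("..."), capturing the quoted argument.
    // [^""]* stops at the first closing quote, so the capture cannot run past it.
    private static readonly Regex HelperCall = new Regex(
        @"(?:ResolveClientUrl|Url\.Content)\(""([^""]*)""\)");

    // Returns true and sets path when one of the two helper calls is found;
    // otherwise returns false and sets path to an empty string.
    public static bool TryExtractPath(string href, out string path)
    {
        Match match = HelperCall.Match(href);
        path = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

Any other helper that the markup might use would need to be added to the alternation by hand, so this is only as general as the list of names in the pattern.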
Sergey Berezovskiy
1

Apart from the fact that you shouldn't parse HTML with regex, I'd go for

\(\"(.+)\"\)

as your regex. Simply extract anything between (" and ").

For example:

string strRegex = @"\(\""(.+)\""\)";
Regex myRegex = new Regex(strRegex, RegexOptions.None);
string strTargetString = @"<link href=""<%= Page.ResolveClientUrl(""~/Styles/CAR.css"") %>"" rel=""stylesheet"" type=""text/css"" />";

foreach (Match myMatch in myRegex.Matches(strTargetString))
{
  if (myMatch.Success)
  {
    // myMatch.Groups[1].Value holds the captured path, e.g. ~/Styles/CAR.css
  }
}

(example code taken from http://regexhero.net/tester/)

If there is only one occurrence of <link href="<%= Page.ResolveClientUrl("~/Styles/CAR.css") %>" rel="stylesheet" type="text/css" />, or you only want the first occurrence, you can drop the foreach loop and use:

string strRegex = @"\(\""(.+)\""\)";
Regex myRegex = new Regex(strRegex, RegexOptions.None);
string strTargetString = @"<link href=""<%= Page.ResolveClientUrl(""~/Styles/CAR.css"") %>"" rel=""stylesheet"" type=""text/css"" />";

Match myMatch = myRegex.Match(strTargetString);

The difference is between Regex.Matches(string), which returns a MatchCollection containing every match, and Regex.Match(string), which returns a single Match for the first occurrence only.
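
To make the distinction concrete, here is a small sketch that wraps both calls and reads the first capturing group rather than the whole match. The class and method names (`MatchVersusMatches`, `FirstPath`, `AllPaths`) are hypothetical. It also swaps the greedy `.+` for a lazy `.+?`, which is a change from the pattern above: with several quoted arguments in one input string, the greedy version would capture everything from the first `("` to the last `")`.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchVersusMatches
{
    // Lazy quantifier: each match ends at the nearest closing "). 
    private static readonly Regex Pattern = new Regex(@"\(""(.+?)""\)");

    // Regex.Match returns a single Match for the first occurrence.
    // Success must be checked before reading Groups[1].
    public static string FirstPath(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // Regex.Matches returns a MatchCollection with every occurrence.
    // Each element in the collection is already a successful match.
    public static List<string> AllPaths(string input)
    {
        var paths = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            paths.Add(m.Groups[1].Value);
        }
        return paths;
    }
}
```

`Groups[0]` is the whole match including the parentheses and quotes, while `Groups[1]` is only the text captured between the quotes.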

Community
KeyNone
  • I don't want to use a foreach loop here. Is there a way to assign the value of "myRegex.Matches(strTargetString)" to "myMatch" directly? When I use "Match myMatch = myRegex.Matches(strTargetString);", I get a null value in myMatch. – Kabir Jan 28 '14 at 11:42
  • @Kanaiya well, `Matches(string)` returns a [`MatchCollection`](http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchcollection.aspx), i.e. every match. If you only want a single match, use [`Match(string)`](http://msdn.microsoft.com/en-us/library/twcw2f1c.aspx), which returns the first occurrence (if any). If there is only one occurrence of ` – KeyNone Jan 28 '14 at 12:31
  • Thanks a lot. That works, but I get {("~/Styles/CAR.css")}. I just need ~/Styles/CAR.css to be extracted. I tried editing your regex, but I am still unable to get the desired output. Please let me know the regex that gives this output value. – Kabir Jan 29 '14 at 06:12
  • @Kanaiya most likely this is NOT a problem with the regex. The **whole match** is `("~/Styles/CAR.css")` since we _match_ on `("` and `")`, too! You want to extract only the first [capturing group](http://www.regular-expressions.info/brackets.html). This can be done with `myMatch.Groups[1].Value` (but be careful: the [Groups](http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.match.groups.aspx) property won't have two (or more) members when the regex fails to match!) – KeyNone Jan 29 '14 at 08:37
0

Use this:

/\(([^\)]*)\)/

Tested with Perl:

> cat temp
<link href="<%= Page.ResolveClientUrl("~/Styles/CAR.css") %>" rel="stylesheet" type="text/css" />
> perl -lne 'print $1 if(/\(([^\)]*)\)/)' temp
"~/Styles/CAR.css"
> 
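
As the output shows, this pattern captures the surrounding double quotes along with the path. For a C# caller, one way to drop them is to trim the quotes from the captured group. The following is a sketch under that assumption; `ParenthesesExtractor` and `ExtractUnquoted` are hypothetical names, not part of the answer above.

```csharp
using System.Text.RegularExpressions;

public static class ParenthesesExtractor
{
    // Captures everything between the first "(" and the next ")",
    // then strips leading and trailing double quotes from the capture.
    // Returns an empty string when the line has no parenthesized segment.
    public static string ExtractUnquoted(string line)
    {
        Match m = Regex.Match(line, @"\(([^)]*)\)");
        return m.Success ? m.Groups[1].Value.Trim('"') : string.Empty;
    }
}
```

Because the pattern keys only on parentheses, it will also match any earlier parenthesized text on the same line, so it suits input where the helper call is the first such segment.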
Vijay
0

<link href="<%= Page.ResolveClientUrl("~/Styles/CAR.css") %>" rel="stylesheet" type="text/css" />

For a quick regex, we can target the text inside the quotes within the parentheses, ("~/Styles/CAR.css"), and capture it in a group.

Escaping the parentheses, one quick regex would be:

<link href="<%=.*\("(.*)"\).*%>"(.*/>)

The regex above has two groups. The first captured group gives us the required value, i.e. ~/Styles/CAR.css.

You can check it at http://regexpal.com/ and experiment with other patterns.

NavyCody