0

A third-party tool I'm using builds an anchor tag like so:

"<a href="http://DevNode/Lists/Publications.aspx#/publication/123">http://DevNode/Lists/Publications.aspx#/publication/123</a>"

I need to isolate the href so I can trim it. Currently my pattern

reg = /^(<a\shref=")? http:\/\/DevNode\/Lists\/Publications.aspx#\/publication\/(\d+)/i

fails to match if the href has a leading space, like this:

"<a href=" http://DevNode/Lists/Publications.aspx#/publication/123"> http://DevNode/Lists/Publications.aspx#/publication/123</a>"

Please help

Jeremy Nelson

4 Answers

6

If you're doing this in a browser, the simplest way is to let the browser figure it out:

var div = document.createElement("div");
div.innerHTML = yourString;
var href = div.querySelector("a").href;

This also has the advantage of resolving it if it's a relative URL. If you don't want that, use getAttribute instead:

var href = div.querySelector("a").getAttribute("href");

Note that getAttribute returns the attribute value as written, so if the attribute has a leading space, that space will be in the result; String#trim is useful if you want to get rid of it.

T.J. Crowder
1

You may want to use the * quantifier, which means "any number of times, including zero". Combined with \s, it matches any run of whitespace (spaces, tabs, newlines), including none at all.

So use \s+ where a space is required but there might be more than one.

And use \s* where a space is optional but there might be several.

reg = /^(<a\s+href=")?\s*http:\/\/DevNode\/Lists\/Publications.aspx#\/publication\/(\d+)/i
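
As a quick check of the pattern above, the sketch below runs it against both forms of the string from the question. The variable names and the shortened link text are made up for illustration, and the dot in `.aspx` is escaped here, which is an addition not present in the pattern above.

```javascript
// Pattern from above, with the dot in ".aspx" escaped (an addition for strictness).
const reg = /^(<a\s+href=")?\s*http:\/\/DevNode\/Lists\/Publications\.aspx#\/publication\/(\d+)/i;

// Sample inputs modelled on the question; the link text is shortened.
const withoutSpace = '<a href="http://DevNode/Lists/Publications.aspx#/publication/123">link</a>';
const withSpace = '<a href=" http://DevNode/Lists/Publications.aspx#/publication/123">link</a>';

// \s* absorbs the optional leading space, so both inputs match.
// Capture group 2 holds the publication id.
const idWithoutSpace = reg.exec(withoutSpace)[2];
const idWithSpace = reg.exec(withSpace)[2];

console.log(idWithoutSpace, idWithSpace);
```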
Simon
  • I think this is what would be most helpful, but it does not seem to work. – Jeremy Nelson Oct 17 '16 at 18:15
  • 1
    nvm, I was using var reg = new RegExp(pattern, 'i'); to create the pattern and therefore needed to escape the backslash characters. Thank you; this is different from my title, but indeed the most helpful answer. – Jeremy Nelson Oct 17 '16 at 18:28
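
To illustrate the escaping point from the comment above: when the pattern is passed to `new RegExp` as a string, each backslash has to be doubled, because the string literal consumes one level of escaping first. A minimal sketch, assuming the pattern is held in a string named `pattern` as in the comment; the other names are hypothetical:

```javascript
// In a string literal "\s" collapses to "s", so every backslash is doubled.
const pattern = '^(<a\\s+href=")?\\s*http://DevNode/Lists/Publications\\.aspx#/publication/(\\d+)';
const reg = new RegExp(pattern, 'i');

const input = '<a href=" http://DevNode/Lists/Publications.aspx#/publication/123">';
const match = reg.exec(input);
const publicationId = match ? match[2] : null;
console.log(publicationId);

// With single backslashes the escapes are silently dropped ("\s" becomes "s",
// "\d" becomes "d"), so the resulting pattern no longer matches the input.
const broken = new RegExp('^(<a\s+href=")?\s*http://DevNode/Lists/Publications.aspx#/publication/(\d+)', 'i');
const brokenMatches = broken.test(input);
console.log(brokenMatches);
```

Forward slashes need no escaping in the string form, since there are no `/` delimiters to terminate.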
1

Keep it simple:

var ahref = '<a href="http://DevNode/Lists/Publications.aspx#/publication/123">http://DevNode/Lists/Publications.aspx#/publication/123</a>';
var href = ahref.split('"')[1];
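
Note that this returns whatever sits between the first pair of double quotes, so it relies on href being the first quoted attribute, and a leading space inside the attribute survives. A small sketch that adds `trim()` for the case from the question; `hrefBySplit` is a hypothetical helper name:

```javascript
// Hypothetical helper: returns the trimmed text between the first pair of
// double quotes, or null when there is no complete quoted value.
function hrefBySplit(anchorHtml) {
  const parts = anchorHtml.split('"');
  // Two quotes produce at least three parts; anything less has no closed value.
  return parts.length > 2 ? parts[1].trim() : null;
}

const spaced = '<a href=" http://DevNode/Lists/Publications.aspx#/publication/123">link</a>';
console.log(hrefBySplit(spaced));
```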
Dan Nagle
0

An easy/fast answer is using jQuery, building the tag, and looking for the href attribute.

$('<a href="http://DevNode/Lists/Publications.aspx#/publication/123">http://DevNode/Lists/Publications.aspx#/publication/123</a>')
.attr('href')

I'll try to get you the RegExp in a bit. Hang on tight...

And as promised... here's the RegExp:

var text = '<a href="http://DevNode/Lists/Publications.aspx#/publication/123">http://DevNode/Lists/Publications.aspx#/publication/123</a>';
console.log(text.match(/<a\s+(?:[^>]*?\s+)?href="([^"]*)"/)[1])
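
One caveat: `match` returns null when nothing matches, so indexing `[1]` directly throws on input without an href. A hedged sketch that wraps the same pattern in a hypothetical `extractHref` helper, guards against null, and trims the leading space from the question:

```javascript
// Hypothetical helper around the pattern above.
function extractHref(html) {
  const match = html.match(/<a\s+(?:[^>]*?\s+)?href="([^"]*)"/);
  // match is null when the string contains no <a ... href="..."> tag.
  return match ? match[1].trim() : null;
}

const spacedAnchor = '<a href=" http://DevNode/Lists/Publications.aspx#/publication/123">link</a>';
console.log(extractHref(spacedAnchor));
console.log(extractHref('<span>no link here</span>'));
```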
JorgeObregon