
I have a string that contains an <a> tag with an href attribute. I need a regex that matches only the value of the href.

<a href="http://value.com">VALUE HERE</a> <-- string to find
<a href="www.twittor.com">TWITTOR VALUE HERE</a> <-- another string to find

I would like to get exactly http://value.com or www.twittor.com. I searched the site and found many solutions, but they all match additional text, not just the value itself.

For example, the pattern from "Regex to find Href value" matches the whole of href="http://value.com", and so do the others.

TwittorDrive
  • 2
    Cue zalgo post. I can't. I'm on mobile. Someone, please. – Jaromanda X Aug 23 '22 at 14:09
  • 1
    To satisfy Jaro's yearning: [that comment](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – Andy Aug 23 '22 at 14:59
  • I get the point, but I just need to find a substring using a pattern; it is not as complex a task as parsing a whole HTML document. By the way, what is 'Cue zalgo post'? – TwittorDrive Aug 23 '22 at 15:05

2 Answers

Use a regular expression with a capturing group (enclosed in ()). Then use .exec and grab the last item from the return value of .exec:

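const inputA = '<a href="http://value.com">VALUE HERE</a>';
const inputB = '<a href="www.twittor.com">TWITTOR VALUE HERE</a>';

// Return the last element of an array (here, the last capture group).
const last = list => list[list.length - 1];
// [^"]* stops at the closing quote; a greedy .* would run on to the last
// double quote in the input if the tag had further attributes.
const extract = input => /href="([^"]*)"/.exec(input);

console.log(last(extract(inputA))); // http://value.com
console.log(last(extract(inputB))); // www.twittor.com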
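
If the string may contain several links, or attributes quoted with single quotes, the same capture-group idea could be extended with String.prototype.matchAll. The following is only a sketch under those assumptions: the helper name extractHrefs and the sample input are hypothetical, and it is still plain pattern matching, not HTML parsing.

```javascript
// Hypothetical helper: collect every href value found in a string.
// (["']) captures the opening quote, and \1 requires the same closing quote.
const extractHrefs = input =>
  Array.from(input.matchAll(/href=(["'])(.*?)\1/g), match => match[2]);

// Illustrative input only: two links, one per quote style.
const sample =
  '<a href="http://value.com" class="x">VALUE HERE</a> ' +
  "<a href='www.twittor.com'>TWITTOR VALUE HERE</a>";

console.log(extractHrefs(sample));
```

When nothing matches, the helper returns an empty array, so the caller does not need a null check.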
David

Using the native DOM parser might be a viable alternative to a regex. Pass the string to parseFromString, then return the href attribute of the first child element in the body of the document the parser returns.

const str1 = '<a href="http://value.com">VALUE HERE</a>';
const str2 = '<a href="www.twittor.com">TWITTOR VALUE HERE</a>';

const parser = new DOMParser();

function getHref(parser, str) {
  return parser
    .parseFromString(str, 'text/html')
    .body.firstChild.getAttribute('href');
}

console.log(getHref(parser, str1));
console.log(getHref(parser, str2));
Andy
  • I'm afraid there is no native DOMParser in Node.js, which I am using for this task. But thanks, I will use it on the client side. – TwittorDrive Aug 23 '22 at 14:55
  • 1
    I missed the node tag :) [Here are some suggestions for node](https://stackoverflow.com/questions/11398419/trying-to-use-the-domparser-with-node-js). Side note: https://blog.codinghorror.com/parsing-html-the-cthulhu-way/ – Andy Aug 23 '22 at 14:57