
I have the source of an HTML page as a string on the server side.

I need to extract every <link rel="icons" ...> tag from the string and push each extracted string to an array. There can be multiple links with the same starting tag.

The <link rel="icons"................ > tag can contain anything inside it. I have defined the startTag and endTag in the code below.

  var startTag = '<link rel="icons"';
  var endTag = '>';
  const re = new RegExp('(' + startTag + ')(.|\n)+?(' + endTag + ')', 'g');

However, when I log the value of re to the console, it is not what I expect.

Desired output:

['<link rel="icons" href="icons1.png"', '<link rel="icons" href="icons2.png"', '<link rel="icons" href="icons3.png"']

Thanks in advance.

eric.joy

1 Answer


I think you're looking for something like this (the replace is just to remove extra whitespace):

const data = `
  <link 
    rel="icons"
    href="icons1.png"
  >
  <link 
    rel="icons"
    href="icons2.png"
  >
  <link 
    rel="icons"
    href="icons3.png"
  >
`;

const links = data.match(/<link.*?>/gs)
  .map(link => link.replace(/\s+/g, ' '));

console.log(links);

If you're in an environment that doesn't support the s flag, you could use /<link[^]*?>/g instead.
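For completeness, here is a minimal sketch of how the startTag and endTag from the question could be turned into a working pattern. Logging re only prints the regular expression itself; the matches come from calling match on the string. The sample html string and the escapeRegExp helper are assumptions made for illustration, not part of the original code.

```javascript
// Hypothetical sample input; the real page source would come from the server.
const html = [
  '<link rel="icons" href="icons1.png">',
  '<link rel="stylesheet" href="main.css">',
  '<link rel="icons"',
  '      href="icons2.png">',
].join('\n');

// Illustrative helper: escape regex metacharacters so the tags match literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const startTag = '<link rel="icons"';
const endTag = '>';

// [^]*? lazily matches any character, including newlines, without the s flag.
const re = new RegExp(escapeRegExp(startTag) + '[^]*?' + escapeRegExp(endTag), 'g');

// match() returns null when nothing matches, so fall back to an empty array.
const iconLinks = (html.match(re) || []).map((link) => link.replace(/\s+/g, ' '));

console.log(iconLinks);
```

Note that this only finds tags that begin with exactly the startTag text, so a link whose rel attribute is not the first attribute would be missed.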

Scott Rudiger
  • Hi @scott, what if I only need the href value from the links that match the condition? For example: ["icons1.png","icons2.png","icons3.png"] – eric.joy Mar 08 '19 at 07:25
  • @eric.joy In that case, you don't need the `map`; you could just use this regex: `const links = data.match(/(?<=href=")\w+\.\w+/g);` [This page](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp) is useful for this kind of stuff. – Scott Rudiger Mar 08 '19 at 12:48
  • If that answers your question, please don't forget to accept the answer. :) – Scott Rudiger Mar 08 '19 at 12:53
  • In this scenario it works perfectly fine, as `data.match(/<link.*?>/gs).map(link => link.replace(/\s+/g, ' '));` only returns an array of links. I would also like to extract the href from the links that have rel="icons" and discard any other href that doesn't meet the rel="icons" condition. – eric.joy Mar 10 '19 at 07:10
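
One possible way to handle the last request is to do it in three steps: collect every link tag, keep only the ones carrying rel="icons", and capture the href value from each. The sketch below assumes double-quoted attribute values and uses a hypothetical pageSource string; a regular expression is not a full HTML parser, so unusual markup may need a real parser instead.

```javascript
// Hypothetical sample input mixing icon links with an unrelated stylesheet link.
const pageSource = `
  <link rel="icons" href="icons1.png">
  <link rel="stylesheet" href="main.css">
  <link
    href="icons2.png"
    rel="icons"
  >
`;

// Step 1: grab every <link ...> tag ([^]*? also spans newlines).
const linkTags = pageSource.match(/<link[^]*?>/g) || [];

// Step 2: keep only tags carrying rel="icons".
// Step 3: capture the href value of each remaining tag.
const iconHrefs = linkTags
  .filter((tag) => /\srel="icons"/.test(tag))
  .map((tag) => tag.match(/\shref="([^"]*)"/))
  .filter((found) => found !== null) // skip icon links that have no href
  .map((found) => found[1]);

console.log(iconHrefs);
```

Because the filter looks at the whole tag, the order of the rel and href attributes does not matter, and this version avoids lookbehind, which some older JavaScript engines do not support.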