
I have a string like this:

'Hello, my name is <a href="https://stackoverflow.com/foo/bar">foobar</a>
My favorite food is <a href="https://google.com/food/pizza">pizza</a>'

I want to use JavaScript to replace all the HTML links with Markdown links, like this:

'Hello, my name is [foobar](https://stackoverflow.com/foo/bar)
My favorite food is [pizza](https://google.com/food/pizza)'

How can I accomplish this? I know regex might be the answer but I'm not sure how to use it to solve my problem. Thanks in advance.

CubeyTheCube
    No, I don't think regex is the answer here. It's just a matter of reading some DOM nodes and replacing them, which is not much effort. Search for "how to get href value" and "how to get href text" to get started. – Lelio Faieta Dec 30 '20 at 18:58

1 Answer

As Lelio pointed out, using regex to parse HTML is not a good idea. Instead, you can create a DOM node, read each link's href and innerText, and replace the link with its Markdown equivalent.

Below is a snippet.

Steps

  1. Create a DOM node. Here, I used a p element.
  2. Set the text as the innerHTML of the node.
  3. Use querySelectorAll to get all the a tags in the node.
  4. Iterate over the results and replace each link's outerHTML in the text with a Markdown link built from its innerText and href attribute.

let text = `Hello, my name is <a href="https://stackoverflow.com/foo/bar">foobar</a>
My favorite food is <a href="https://google.com/food/pizza">pizza</a>`;
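// Parse the string in a detached element so the browser handles the HTML.
let p = document.createElement('p');
p.innerHTML = text;
let links = p.querySelectorAll('a');
links.forEach(x => {
    // getAttribute returns the href exactly as written; x.href would
    // resolve relative URLs against the current page.
    // Note: replace only matches if outerHTML equals the source markup.
    text = text.replace(x.outerHTML, "[" + x.innerText + "](" + x.getAttribute('href') + ")");
});
console.log(text);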

Recommended reading: Why you should not use Regex to parse html

If you have a reason to stick with regex, here's a rough one: /<a\shref=\"([^"]*)">([^<]*)<\/a>/igm

let text = `Hello, my name is <a href="https://stackoverflow.com/foo/bar">foobar</a>
My favorite food is <a href="https://google.com/food/pizza">pizza</a>`;
console.log(text.replace(/<a\shref=\"([^"]*)">([^<]*)<\/a>/igm, (match, url, label) => "[" + label + "](" + url + ")"));

Note that this regex won't work if the a tag has attributes other than href, or if it has HTML elements as children.
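If those two limitations matter and a DOM is not available, the pattern can be loosened a little. Below is a minimal sketch, not a full HTML parser: the function name htmlLinksToMarkdown is hypothetical, and it assumes quoted href values, no nested a elements, and no > character inside attribute values.

```javascript
// Hypothetical helper: converts simple <a> tags to Markdown links.
// Assumptions: href is quoted (single or double), attribute values
// contain no ">", and <a> elements are not nested.
function htmlLinksToMarkdown(html) {
  // 1: quote character, 2: href value, 3: everything between <a ...> and </a>
  const anchor = /<a\b[^>]*?\shref\s*=\s*(["'])(.*?)\1[^>]*>([\s\S]*?)<\/a>/gi;
  return html.replace(anchor, (match, quote, url, inner) => {
    // Drop any child tags (e.g. <b>) and keep only the visible text.
    const label = inner.replace(/<[^>]*>/g, '').trim();
    return `[${label}](${url})`;
  });
}

const sample = 'My favorite food is <a class="link" href="https://google.com/food/pizza" target="_blank"><b>pizza</b></a>';
console.log(htmlLinksToMarkdown(sample));
// → My favorite food is [pizza](https://google.com/food/pizza)
```

This still breaks on unusual markup (unquoted attributes, comments, a > inside an attribute), so the DOM approach above remains the safer choice where it is available.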

Sagar V
  • @CubeyTheCube The regex part contains a snippet. It's hidden by default. There are a lot of libraries available on npm that allow you to work with the DOM in Node.js. – Sagar V Dec 30 '20 at 19:31