I'm trying to remove everything inside any HTML tag:

input:

<a class="yoyo"> <h1 id="test"> hello </h1> </a>

It's a weird example, but it shows the idea.

output:

<a><h1>hello</h1></a>

I've tried /(<\w)(?:.*)(>)/gmi, but it's not working...

If you have any clue about this, thanks.


To explain further after your comments: I scraped a website and have a .txt file that I want to clean. It contains the whole HTML of a page, and I want to clean every single HTML tag and remove the spaces as well. So everything between any <* and > should be removed.

kikiwie
  • Can you elaborate? You can have a million HTML tags inside HTML tags; what do you want to do exactly? – James Jul 06 '17 at 15:40
  • Does the regex have to change "hello" into "salut" and remove spaces too? Just kidding about the hello, but I am serious about the spaces. – Kaddath Jul 06 '17 at 15:41
  • Don't do this with regex. – Johan Karlsson Jul 06 '17 at 15:42
  • Why don't you use JS utilities (which are built particularly well for parsing and manipulating markup) rather than a regular expression ([terrible choice for parsing HTML](https://stackoverflow.com/a/1732454/1612146))? – George Jul 06 '17 at 15:42
  • Obligatory link to "that post", George!!! *shakes fist* – James Jul 06 '17 at 15:43
  • Are you working with a string or actual HTML? If the latter, a better approach would be to just remove the attributes with `element.removeAttribute(attrName);` – Kaddath Jul 06 '17 at 15:44
  • **Don't Parse HTML With Regex** – Tom Lord Jul 06 '17 at 15:55

2 Answers

How about the following regex:

<[^>]*>

You will have to concatenate all the matches.
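If the goal is to keep each tag's name and drop only what follows it, one possible sketch is a replace with a capture group instead of collecting matches. The helper name `stripTagContents` is made up for illustration. The pattern assumes that no attribute value contains a literal `>`, and it does not handle comments, scripts, or malformed markup, so treat it as a rough cleanup rather than a real HTML parser.

```javascript
// Hypothetical helper (not part of the original answer): keeps each tag's
// name, drops its attributes, and removes whitespace next to tags.
function stripTagContents(html) {
  return html
    // '<a class="yoyo">' becomes '<a>'; closing tags such as '</h1>' are kept
    .replace(/<(\/?[a-zA-Z][a-zA-Z0-9-]*)[^>]*>/g, '<$1>')
    // remove whitespace directly after '>' ...
    .replace(/>\s+/g, '>')
    // ... and directly before '<'
    .replace(/\s+</g, '<');
}

const input = '<a class="yoyo"> <h1 id="test"> hello </h1> </a>';
console.log(stripTagContents(input)); // → <a><h1>hello</h1></a>
```

Whitespace inside the text itself (for example between two words) is left alone; only the spaces touching a tag are removed.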

Uri Y

Do this using DOM methods. Loop over all elements, iterate over their attributes, and remove them.

// Grab the container and every element nested inside it
const cont = document.getElementById('demo-container'),
  els = cont.querySelectorAll('*');

Array.from(els).forEach(el => {
  // Copy the live attributes collection before removing from it
  Array.from(el.attributes).forEach(attr => {
    el.removeAttribute(attr.name);
  });
});

console.log(cont.innerHTML);

<div id="demo-container">
  <a class="yoyo">
    <h1 id="test"> hello </h1>
  </a>
</div>
charlietfl