[EDIT] First of all: I am aware that DOM manipulation tools make it easy to modify DOM structures. (They are available in every browser, and in Node through third-party libraries.)

The goal of this question is to see whether someone can come up with a clever way to solve this problem with RegExp in JavaScript, without the DOM.

Thank you! [End of EDIT]

Let's say I have the following HTML snippet as a string:

<p>
    <div class="something">
        <span>
            <div class="else">My precious text.</div>
        </span>
    </div>
</p>

I wish to get rid of the <div class="something"> tags with RegExp in order to get something like this (indentation does not matter):

<p>
    <span>
        <div class="else">My precious text.</div>
    </span>
</p>

So, my attempt was:

htmlString.replace(/<div class="something">([\s\S]+?)<\/div>/gi, "$1");

But of course the lazy match stops at the first closing tag it finds, which belongs to <div class="else">, not to <div class="something">.
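To make the failure concrete, here is a small sketch of what the attempt produces. The single-line input string is an assumption, used only to keep the output easy to read:

```javascript
// Same markup as above, collapsed onto one line (an assumption for readability).
var htmlString = '<p><div class="something"><span><div class="else">My precious text.</div></span></div></p>';

// The lazy quantifier stops at the first </div>, which closes class="else".
var attempt = htmlString.replace(/<div class="something">([\s\S]+?)<\/div>/gi, "$1");

console.log(attempt);
// <p><span><div class="else">My precious text.</span></div></p>
```

The inner div loses its closing tag, and the closing tag of the outer div is left behind.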

How can I do this properly using just vanilla JS, without the browser's DOM manipulation tools (e.g. in Node)?

Erik Kránicz
  • https://stackoverflow.com/q/1732348/8284239 – Joe Warner May 27 '18 at 21:07
  • There's no way to do it "properly" using just vanilla JS. Use something like [node-jsdom](https://www.npmjs.com/package/node-jsdom). – Casimir et Hippolyte May 27 '18 at 21:08
  • Well, if I wanted to use a third-party library in Node to emulate the DOM, I would recommend [jsdom](https://www.npmjs.com/package/jsdom). However, I was just curious whether someone is clever enough to solve the above problem with regexp and without a third-party library. – Erik Kránicz May 28 '18 at 08:33

1 Answer

Regex isn't the best option for this. If it's HTML you're dealing with, you're better off using the DOM querying abilities of JS. Here's a quick example of what you could do (oversimplified for explanation purposes):

var htmlString = '<p><div class="something"><span><div class="else">My precious text.</div></span></div></p>';

//Create a container element so you can query what you need and manipulate it
var container = document.createElement("div");
container.innerHTML = htmlString;

//Find what you need to remove
var toRemove = container.getElementsByClassName("something")[0];

//Move every child out of the unwanted element, keeping it in the same position
while (toRemove.firstChild) {
    toRemove.parentNode.insertBefore(toRemove.firstChild, toRemove);
}

//Remove the now-empty unwanted element
toRemove.parentNode.removeChild(toRemove);

//If you really want it back as a string
var result = container.innerHTML;

PS: It's not valid to have a div inside a paragraph element. The HTML parser closes the <p> as soon as it reaches the <div>, so the parsed tree will not match the nesting in your string.
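For the no-DOM case raised in the question, one possible sketch (not a general HTML parser) uses a regex only to find div tags and a stack to pair each closing tag with its opening tag. The function name unwrapDiv is hypothetical. It assumes well-formed markup, a double-quoted class attribute, a class name without regex special characters, and no div-like text inside comments, scripts, or attribute values:

```javascript
// Hypothetical helper, not taken from the question or from any library.
// Removes <div class="className"> and its matching </div>, keeping the contents.
function unwrapDiv(html, className) {
  // Matches any opening or closing div tag. Group 1 is "/" for a closing tag,
  // group 2 holds the attribute text of an opening tag.
  var tagPattern = /<(\/?)div\b([^>]*)>/gi;
  // Checks that className appears as a whole token inside class="...".
  var classPattern = new RegExp(
    '\\bclass="(?:[^"]*\\s)?' + className + '(?:\\s[^"]*)?"', 'i');

  var stack = [];      // one boolean per open div: true if it is being removed
  var output = '';
  var lastIndex = 0;
  var match;

  while ((match = tagPattern.exec(html)) !== null) {
    // Copy the text between the previous div tag and this one unchanged.
    output += html.slice(lastIndex, match.index);
    lastIndex = tagPattern.lastIndex;

    var drop;
    if (match[1] === '/') {
      // A closing tag belongs to the most recently opened div.
      drop = stack.pop() === true;
    } else {
      drop = classPattern.test(match[2]);
      stack.push(drop);
    }
    if (!drop) {
      output += match[0];
    }
  }
  // Copy whatever follows the last div tag.
  return output + html.slice(lastIndex);
}

var input = '<p><div class="something"><span><div class="else">My precious text.</div></span></div></p>';
console.log(unwrapDiv(input, 'something'));
// <p><span><div class="else">My precious text.</div></span></p>
```

A single regex cannot count nesting depth on its own, which is why the pairing is done with a stack here. Anything outside the stated assumptions still calls for a real parser.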

Luke Benting
  • With the browser's DOM manipulation tools this problem is a piece of cake, just as you described. But what if I have to solve it without DOM support (e.g. in Node)? As I commented above, I can use a third-party library like [jsdom](https://www.npmjs.com/package/jsdom), but I was just curious whether someone is clever enough to solve this problem with regexp, without DOM support or a third-party library. (PS: I know that I am not supposed to put a `<div>` inside a `<p>`; I just used it for the sake of the example.)
    – Erik Kránicz May 28 '18 at 08:48