
Suppose I have a string like this that will be inserted into my HTML:

<div>Wakanda Forever</div> <span class="movie">Black Panther</span>
Movies movies movies 
<span class="movie">Spider man...

The last span tag isn't closed.

Using a regular expression in JavaScript, how can I either remove the unclosed <span class="movie"> from <span class="movie">Spider man... or close it with a </span> tag?

Wiktor Stribiżew
Rongeegee

1 Answer


Using regular expressions to do any sort of HTML manipulation is almost always a bad idea.

My recommended solution would be to do what the browser does: Parse the string into a DOM (in a similar fuzzy, forgiving way) and then turn that DOM back into HTML.

In a browser environment this is especially easy, because the browser itself can do the work for you. Write the bad HTML into the innerHTML of an element, then read it back; the browser will have repaired it:

const badHtml = `
<div>Wakanda Forever</div> <span class="movie">Black Panther</span>
Movies movies movies 
<span class="movie">Spider man...
`

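// A detached scratch element; it never has to be added to the page
const element = document.createElement('i')
// Assigning to innerHTML makes the browser parse (and repair) the markup
element.innerHTML = badHtml
// Reading innerHTML serializes the repaired DOM back into a string
const result = element.innerHTML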

console.log(result)
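
The same parse-then-serialize round trip can also be wrapped in a small helper. The sketch below is only one possible shape: `repairHtml` is a hypothetical name, and it uses the browser's standard `DOMParser` rather than a scratch element. The optional `parser` parameter is an assumption added here so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: parse possibly broken HTML and serialize it back.
// `parser` defaults to the browser's DOMParser; the default expression is
// only evaluated when no parser is passed in.
function repairHtml(badHtml, parser = new DOMParser()) {
  // parseFromString builds a full document (html/head/body) from the markup,
  // implicitly closing any elements that were left open.
  const doc = parser.parseFromString(badHtml, 'text/html')
  // Body-level content ends up inside <body>; read it back as a string.
  return doc.body.innerHTML
}
```

In a browser, `repairHtml('<span class="movie">Spider man...')` should return `<span class="movie">Spider man...</span>`. Because a whole document is parsed, markup that belongs in `<head>` (a `<title>`, for example) would not appear in the returned string, so this variant is best suited to body-level fragments like the one in the question.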

In Node.js, you could instead use a library like cheerio:

// cheerio 1.x no longer provides a default export, so import the namespace
import * as cheerio from 'cheerio'

const badHtml = `
<div>Wakanda Forever</div> <span class="movie">Black Panther</span>
Movies movies movies 
<span class="movie">Spider man...
`

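// The third argument (isDocument = false) parses the input as a fragment,
// so the output is not wrapped in <html>, <head> and <body>
const $ = cheerio.load(badHtml, null, false)
const result = $.html()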

console.log(result)
CherryDT
  • If I create this "element" with document.createElement('i') without adding it to the DOM, what will happen? Do I need to dereference "element", since it's only the HTML returned from innerHTML that I need? I don't need the created element itself. – Rongeegee Apr 28 '22 at 16:51
  • If I put this code in a function, will "element" be dereferenced after the function exits? – Rongeegee Apr 28 '22 at 17:01
  • JavaScript has a garbage collector. Since the element isn't used anywhere afterwards (and also isn't part of some function's closure, as it would be in an event listener), it will eventually just be thrown away; there is nothing you need to do about it. (Of course you could try to optimize this a bit by creating the element outside of the function and reusing it every time.) – CherryDT Apr 28 '22 at 18:27
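
The optimization mentioned in the last comment (create the element once, reuse it on every call) could look roughly like the sketch below. The names `createHtmlRepairer` and `repair` are hypothetical, and the `doc` parameter is an assumption added so the code can be tried with a stand-in object outside a browser; in a page it simply defaults to the global `document`.

```javascript
// Hypothetical factory: creates the scratch element once and returns a
// function that reuses it for every call.
function createHtmlRepairer(doc = document) {
  // Detached element; it is never attached to the page.
  const scratch = doc.createElement('i')

  return function repair(badHtml) {
    // The browser parses the markup and closes any open tags.
    scratch.innerHTML = badHtml
    // Serialize the repaired DOM back into a string.
    const result = scratch.innerHTML
    // Empty the scratch element so the parsed nodes can be garbage collected.
    scratch.innerHTML = ''
    return result
  }
}
```

Usage would be something like `const repair = createHtmlRepairer()` once at startup, followed by `repair(badHtml)` wherever a string needs fixing. Whether this is measurably faster than creating a fresh element per call depends on the workload, so treat it as an optional tweak rather than a requirement.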