
I am trying to sanitize a string with HTML code. The string could contain multiple HTML tags.

I want to sanitize all of the elements in the string, except for <br> and <font> tags. By sanitizing, I mean replacing the angle brackets with their HTML entities, so that the browser shows the tags as text instead of rendering them.

I have tried other code that was suggested, but it didn't seem to work, and I couldn't figure out how to modify it so that it leaves those two elements alone.

For example, when I have a string with HTML elements in it, I want to sanitize all the tags except for the <br> and <font> ones.

ghosty

1 Answer


Do something like this: use a regex to find the tags, and pass a function as the second argument to replace. Inside that function you can check which tag you are dealing with. If it is not a br or font tag, replace '<' with '&lt;' and '>' with '&gt;'. The browser displays these entities as '<' and '>', but does not interpret them as markup, so they are harmless.

const sanitize = html => html.replace(/<[^>]*>/g, found =>
  // Keep <br>, <font ...> and </font>; escape every other tag.
  /^<\/?(br|font)\b/i.test(found) ?
    found : found.replace(/</g, '&lt;').replace(/>/g, '&gt;'));

// test
sanitize('<script></script>hello<br><b>test</b><font...>');
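
The same approach can be generalised to a configurable list of allowed tags. The following is only a minimal sketch: `sanitizeExcept` and its `allowed` parameter are hypothetical names introduced here, and it assumes tag names consist of letters and digits. It keeps both the opening and closing forms of the allowed tags and escapes everything else.

```javascript
// Hypothetical helper: escape every tag whose name is not in `allowed`.
// Assumes `allowed` holds plain tag names such as 'br' or 'font'.
const sanitizeExcept = (html, allowed = ['br', 'font']) => {
  const names = new Set(allowed.map(name => name.toLowerCase()));
  return html.replace(/<[^>]*>/g, found => {
    // Capture the tag name after an optional leading slash.
    const match = /^<\/?([a-z][a-z0-9]*)\b/i.exec(found);
    const keep = match !== null && names.has(match[1].toLowerCase());
    return keep
      ? found
      : found.replace(/</g, '&lt;').replace(/>/g, '&gt;');
  });
};

console.log(sanitizeExcept('<b>bold</b><br><font color="red">red</font>'));
// &lt;b&gt;bold&lt;/b&gt;<br><font color="red">red</font>
```

Attributes on the allowed tags pass through unchanged in this sketch, so it only decides which tags survive, not what they carry.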

By the way, the font tag is obsolete in HTML5. This is the first time in years that I have heard someone mention it.

Why, you might ask, am I using the 'g' flag to replace all occurrences of < and >? That's in case someone is sneaky and writes a < inside another tag, as in <<script>.
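
The difference is easy to see on a small input. This is a minimal sketch in which `escapeFirst` and `escapeAll` are hypothetical helper names used only for the comparison. The string '<<script>' is matched as a single tag by /<[^>]*>/g, so whatever escapes it has to handle both angle brackets on the left.

```javascript
// Hypothetical helpers for illustration only.
// A string pattern replaces just the first occurrence.
const escapeFirst = tag => tag.replace('<', '&lt;').replace('>', '&gt;');
// A regex with the 'g' flag replaces every occurrence.
const escapeAll = tag => tag.replace(/</g, '&lt;').replace(/>/g, '&gt;');

const sneaky = '<<script>';

console.log(escapeFirst(sneaky)); // &lt;<script&gt;
console.log(escapeAll(sneaky));   // &lt;&lt;script&gt;
```

Without the 'g' flag a raw < is left in the output, which is exactly what the sanitizer is supposed to prevent.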

Thomas Frank