RegEx 5 Strings to Match - Replace Based on Match

Question

Angular/JS Application

I have this: input.replace('/&lt;|&gt;|&quot;|&amp;|&apos;/gm', need this to be based on match value).

So I want to search for all of those strings, but replace each one with a value that depends on which string matched. So if &quot; matches, replace it with " and if &gt; matches, replace it with >.

I basically want to avoid this:

input.replace('/&lt;/gm', '<')

input.replace('/&gt;/gm', '>')

input.replace('&quot;', '"')

I think it has something to do with capturing groups - not a regex person.

Maybe the answer can only be: inputString.replace('/&lt;/gm', '<').replace('/&gt;/gm', '>').replace('/&quot;/gm', '"').replace('/&amp;/gm', '&').replace('/&apos;/gm', '\'');

Your question is unclear. What do you mean by "< if #1"? What does "#1" refer to? — bgfvdu3w, Jun 29 '22 at 17:44
@bgfvdu3w I want to search for all of those strings, but replace each one with a value that depends on which string matched. So if &quot; matches, replace it with " and if &gt; matches, replace it with > — RooksStrife, Jun 29 '22 at 17:45
Does this answer your question? [Unescape HTML entities in JavaScript?](https://stackoverflow.com/questions/1912501/unescape-html-entities-in-javascript) — bgfvdu3w, Jun 29 '22 at 17:52
@bgfvdu3w no I need to specifically match certain strings and replace them with corresponding values. I update the question - it might help. — RooksStrife, Jun 29 '22 at 17:56
Well, the strings you are trying to replace look like they are HTML entities. You're effectively trying to unescape/decode them. That's what I linked a solution to. Do you have other strings that are not HTML entities which need to be replaced too? — bgfvdu3w, Jun 29 '22 at 17:58
@bgfvdu3w they will only be the ones from above. But I want to understand how I could do it with capturing groups (I think that's what it is called). Not just how to encode them. — RooksStrife, Jun 29 '22 at 18:03

Luatic · Accepted Answer · 2022-06-29T18:06:09.600

What's commonly done is to simply chain the replacements, executing one after another as in your example:

input.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&quot;/g, '"').replace(/&amp;/g, "&").replace(/&apos;/g, "'")

The downside of this is that it doesn't scale well: each replace operation runs in linear time. Thus, for m replacements and a string of length n, the time complexity is O(n * m). If you were to implement support for all 2k+ named HTML entities, this would quickly blow up and your performance would degrade severely - not to mention the O(m) intermediate garbage strings that are created in the process, making for O(n * m) garbage data.
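As a minimal sketch of the chained approach (the function name `decodeChained` is a hypothetical wrapper, not something from the question), each call below scans the entire string once. The second example points at a further consequence of chaining that seems worth knowing: because later passes see the output of earlier ones, an input such as `&amp;apos;` gets decoded twice.

```javascript
// Hypothetical wrapper around the chain shown above.
function decodeChained(input) {
  return input
    .replace(/&lt;/g, "<")    // pass 1 over the whole string
    .replace(/&gt;/g, ">")    // pass 2
    .replace(/&quot;/g, '"')  // pass 3
    .replace(/&amp;/g, "&")   // pass 4
    .replace(/&apos;/g, "'"); // pass 5
}

console.log(decodeChained("&lt;a href=&quot;x&quot;&gt;")); // <a href="x">

// "&amp;apos;" encodes the literal text "&apos;". Pass 4 turns it into
// "&apos;", and pass 5 then decodes that a second time.
console.log(decodeChained("&amp;apos;")); // '
```

Reordering the chain so that `&amp;` is handled last would avoid the double decoding, but the result then depends on getting the order right.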

The proper way is to create a lookup table (a hash table; in JS a plain object serves as the dictionary) with O(1) access that maps all the named entities to their replacements:

const namedEntities = {lt: "<", gt: ">", quot: '"', amp: "&", apos: "'"}
return input.replace(/&(lt|gt|quot|amp|apos);/g, (_, match) => namedEntities[match])

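This passes a replacement function to String.prototype.replace. The function receives the whole match as its first argument and the text captured by the group (the entity name) as its second. No intermediate garbage strings are created and the time complexity - assuming an ideal RegEx implementation - is O(n).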
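For reference, here is the lookup-table version as a self-contained sketch. The wrapper name `decodeEntities` and the sample input are assumptions added for illustration; the table and the pattern are the ones from the answer.

```javascript
const namedEntities = { lt: "<", gt: ">", quot: '"', amp: "&", apos: "'" };

// Hypothetical wrapper function around the single replace call.
function decodeEntities(input) {
  // The callback receives the whole match first, then one argument per
  // capturing group. Group 1 is the entity name without "&" and ";".
  return input.replace(
    /&(lt|gt|quot|amp|apos);/g,
    (whole, name) => namedEntities[name]
  );
}

const decoded = decodeEntities("5 &gt; 3 &amp;&amp; &quot;a&quot; &lt; &apos;b&apos;");
console.log(decoded); // 5 > 3 && "a" < 'b'
```

Because the string is scanned only once and replaced text is never rescanned, an input like `&amp;apos;` comes out as the literal text `&apos;`.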

If you want to religiously follow DRY, you might want to build the RegEx from the keys:

const regex = new RegExp("&(" + Object.keys(namedEntities).join("|") + ");", "g")
return input.replace(regex, (_, match) => namedEntities[match])
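One way to package the key-derived RegEx is a small factory that takes any table and returns a decoder. This is only a sketch: `makeDecoder`, `decodeBasic`, `decodeMore` and the second table are hypothetical names and data chosen for the example.

```javascript
// Hypothetical factory: builds a decoder from a name -> character table.
function makeDecoder(table) {
  // Entity names are plain alphanumerics, so the keys are joined without
  // escaping. Keys containing RegExp metacharacters would need escaping.
  const regex = new RegExp("&(" + Object.keys(table).join("|") + ");", "g");
  return (input) => input.replace(regex, (_, name) => table[name]);
}

const decodeBasic = makeDecoder({ lt: "<", gt: ">", quot: '"', amp: "&", apos: "'" });
const decodeMore = makeDecoder({ lt: "<", gt: ">", nbsp: "\u00a0", copy: "\u00a9" });

console.log(decodeBasic("&lt;b&gt;"));   // <b>
console.log(decodeMore("&copy; 2022")); // © 2022
```

Adding an entity then only means adding one key to the table; the pattern follows automatically.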

Alternatively, consider using a more general RegEx, leveraging the dictionary to check whether an entity is valid and defaulting to no replacement:

return input.replace(/&(.+?);/g, (entity, match) => namedEntities[match] || entity)
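A caveat with the general pattern, sketched below under the assumption that inputs may contain bare ampersands: the lazy `.+?` can start at a stray `&` and run up to the `;` of a real entity, so that entity is consumed as part of an unknown name and left undecoded. The stricter variant is an assumed alternative rather than part of the answer. It restricts names to letters and digits, and it uses an own-property check so that inherited object keys such as `toString` are not treated as entities. `decodeLoose` and `decodeStrict` are hypothetical names.

```javascript
const namedEntities = { lt: "<", gt: ">", quot: '"', amp: "&", apos: "'" };

// Pattern from the answer: lazily matches any characters up to the next ";".
function decodeLoose(input) {
  return input.replace(/&(.+?);/g, (entity, name) => namedEntities[name] || entity);
}

// Assumed stricter variant: the name may only contain letters and digits,
// and it must be an own key of the table.
function decodeStrict(input) {
  return input.replace(/&([a-zA-Z0-9]+);/g, (entity, name) =>
    Object.prototype.hasOwnProperty.call(namedEntities, name)
      ? namedEntities[name]
      : entity
  );
}

// The loose pattern matches "&D &lt;" as one unknown entity and skips it.
console.log(decodeLoose("R&D &lt; 5"));  // R&D &lt; 5
console.log(decodeStrict("R&D &lt; 5")); // R&D < 5
```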

any reason you see that it's not working in the fiddle? https://jsfiddle.net/37gufeva/1/ — RooksStrife, Jun 29 '22 at 19:44
@RooksStrife: Strings are immutable, you must `return` the result of the call to `replace` (or overwrite `inputString`). — Luatic, Jun 29 '22 at 20:41
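To illustrate the point about immutability (a sketch that does not assume anything about the fiddle's actual contents; the variable names are made up for the example): `replace` never modifies the string it is called on, so the result has to be returned or assigned.

```javascript
const inputString = "&lt;p&gt;";

// The result of this call is discarded, so inputString stays as it was.
inputString.replace(/&lt;/g, "<");
console.log(inputString); // &lt;p&gt;

// Keeping the returned string is what makes the replacement visible.
let text = "&lt;p&gt;";
text = text.replace(/&lt;/g, "<").replace(/&gt;/g, ">");
console.log(text); // <p>
```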
