
I have found this and this, which demonstrate how to escape reserved regex characters in string literals.

My Code

function escapeRegExp2(string) {
    return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

const str0 = 'some me ome \*me \w*me mee';
const str1 = '\*me';
const cleanStr1 = escapeRegExp2(str1);
const regex = new RegExp(cleanStr1, 'i');
console.log(`str0.match(regex): ${str0.match(regex)}`);

If you look at the output, match() returns *me rather than \*me. In other words, it leaves out the backslash character, which was purposefully included.
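A minimal sketch of what is probably happening here, assuming the strings are written as ordinary JavaScript string literals: `\*` is not a recognized escape sequence, so the backslash is dropped before the regex is ever built. The variable names below are only for illustration.

```javascript
// In an ordinary string literal, '\*' is an unnecessary escape:
// the backslash is silently dropped and only '*' remains.
const literal = '\*me';        // identical to '*me'
const doubled = '\\*me';       // backslash, asterisk, m, e
const raw = String.raw`\*me`;  // same four characters as `doubled`

console.log(literal === '*me'); // true
console.log(literal.length);    // 3
console.log(doubled.length);    // 4
console.log(raw === doubled);   // true
```

So in the first snippet neither `str0` nor `str1` contains a backslash at all, which would explain why the match comes back as `*me`.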

How do I ensure all characters are included in match/search while still escaping malicious code?


Edit

After comments, I have made a simplified example:

const str0 = 'some me ome \\*me \\w*me mee';
const str1 = '\\*me';
console.log(`Match: ${str0.match(str1)}`);

Should it not look for the escaped \*me as declared in str1, in the escaped str0?
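For comparison, here is a sketch of the same search with the escape function from the first snippet applied to the search string. It assumes the goal is to find the four characters `\*me` literally; `haystack` and `needle` are illustrative names, not part of the original code.

```javascript
function escapeRegExp2(string) {
    return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

const haystack = 'some me ome \\*me \\w*me mee'; // contains \*me and \w*me
const needle = '\\*me';                           // the four characters \*me

// Passing the string straight to match() treats it as regex source:
// /\*me/ means "a literal * followed by me", so the backslash is not matched.
const unescaped = haystack.match(needle);
console.log(unescaped[0]); // *me

// Escaping first yields the regex source \\\*me, i.e. /\\\*me/i,
// which matches a literal backslash, a literal *, then me.
const pattern = new RegExp(escapeRegExp2(needle), 'i');
const escaped = haystack.match(pattern);
console.log(escaped[0]);    // \*me
console.log(escaped.index); // 12
```

If no regex features are needed at all, a plain substring check such as `haystack.includes(needle)` avoids the escaping question entirely.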

Magnus
  • Double ``\`` chars. `const str0 = 'some me ome \\*me \\w*me mee';` - this string will have two ``\`` chars in it. – Wiktor Stribiżew Apr 23 '18 at 20:11
  • @WiktorStribiżew Thank you, so it is `str0` that is the culprit, not `regex`. In that case, two follow-up questions if I may: 1) Why does `cleanStr1` (`\\*me`) match with `*me`? If I understand correctly, it should only match with `\\*me` (which escapes to `\*me`). My goal is to make any input (`str1`) match whatever is in `str0`, which comes from a DOM Text node. 2) Do I need to run `escapeRegExp2` on both the Text node's `nodeValue` (here `str0`) and `str1` before comparing, in order to make match work as intended? – Magnus Apr 23 '18 at 20:20
  • 1) `new RegExp("\\*me")` = `/\*me/`, and it cannot match a ``\*me`` string, it can only match `*me`. 2) no idea what you are asking. – Wiktor Stribiżew Apr 23 '18 at 20:24
  • @WiktorStribiżew I have added an edit, with simplified code highlighting the issue. It uses escapes with double backslash, as instructed in the post marked as duplicate. – Magnus Apr 23 '18 at 20:35
  • Yes, and it (`/\*me/`) matches `*me` (after `ome`) as expected. Do you want to match a backslash before `*`? Then use `/\\\*me/`. You may use `escapeRegExp2(str1)` and the whole part from your first snippet. – Wiktor Stribiżew Apr 23 '18 at 20:44
  • As far as your _second_ JS snippet goes: you're saying `str1` is the regex to use, `'\\*me'`. After it is un-stringed, it becomes the raw regex `\*me`. This says match a _literal *_ followed by _me_. So far so good. Now you want to use that regex on `str0` to see if it can match. And it does, so it prints out `*me`, which is what it is supposed to do. _What is the confusion you're having?_ –  Apr 23 '18 at 22:14
  • As far as your escape metacharacters function. It is good. However, I would not use the match variable `$&` as, if it's like Perl, it's problematic. Use a proper capture variable. Like this `replace(/([-\/\\^$*+?.()|[\]{}])/g, '\\$1');` And an fyi, there is no need for the `-` to be included as a metacharacter, because it isn't one, especially since `[]` are being escaped and there is no possibility `-` can be construed as a _class range operator_. –  Apr 23 '18 at 22:20
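
A quick sketch comparing the two replacement forms mentioned in the last comment. In JavaScript, `$&` in a replacement string stands for the whole match, so both versions should produce the same result; the helper names and the sample string are made up for illustration.

```javascript
// Whole-match form from the question.
const withWholeMatch = (s) => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
// Capture-group form suggested in the comment.
const withCaptureGroup = (s) => s.replace(/([-\/\\^$*+?.()|[\]{}])/g, '\\$1');

const sample = 'a-b \\*me [x]{2} (y|z)?';
console.log(withWholeMatch(sample) === withCaptureGroup(sample)); // true

// An unescaped '-' outside a character class is already literal.
console.log(new RegExp('a-b').test('a-b')); // true
```
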

0 Answers