In Perl, there's a function named quotemeta which accepts a string and returns a regex pattern that matches that string. It's used in virtually every program to avoid code injection bugs.

One would use quotemeta when dynamically building a pattern. For example,

"^" + quotemeta(str) + "_\\d+$"

A JavaScript implementation follows:

function quotemeta(s) {
   return String(s).replace(/\W/g, "\\$&");
}
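As a rough usage sketch (the `buildPattern` helper and the sample strings are made up for illustration, not part of any library), this is the difference the escaping makes when the interpolated string contains a metacharacter such as `*`:

```javascript
// Escape every non-word character so the string is matched literally
// (same approach as the quotemeta implementation above).
function quotemeta(s) {
   return String(s).replace(/\W/g, "\\$&");
}

// Hypothetical helper: builds the "^<name>_<digits>$" pattern from the example.
function buildPattern(name) {
   return new RegExp("^" + quotemeta(name) + "_\\d+$");
}

const escaped = buildPattern("a*b");                  // /^a\*b_\d+$/
const unescaped = new RegExp("^" + "a*b" + "_\\d+$"); // /^a*b_\d+$/

console.log(escaped.test("a*b_42"));    // true: "*" is matched literally
console.log(escaped.test("aaab_42"));   // false
console.log(unescaped.test("aaab_42")); // true: "*" acts as a quantifier
console.log(unescaped.test("a*b_42"));  // false
```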

Given how often this function is needed when working with regex patterns, I would have expected JavaScript to provide one. Does JavaScript or jQuery already have such a function?

ikegami
  • @p.s.w.g, I already know it's possible to write a custom solution. My question is whether there already exists one in JS or jQuery. I looked, but I want to double check. Not a duplicate. – ikegami Jan 14 '14 at 20:29
  • As you say, Perl's quotemeta is `[^\w]`. This is actually dangerous if the text is to be injected into a regex. It's better to escape the 12 meta non-class and 4 meta class characters instead, that is, if it's going to be part of a regex string. –  Jan 14 '14 at 20:30
  • @sln, backwards. Far safer to escape anything that might ever need escaping (whitelist) rather than figuring out exactly what needs to be escaped in every implementation (blacklist). – ikegami Jan 14 '14 at 20:32
  • Why do we need to double escape a string intended to be used as a RegEx? Just compile the expression from the string like: `var re = new RegExp('^'+string+'$', 'gi');` Unless I'm missing something... – tenub Jan 14 '14 at 20:34
  • @tenub, Say `string` is `a*b`, `new RegExp('^'+string+'$', 'gi')` matches `a*b`, which `new RegExp('^'+quotemeta(string)+'$', 'gi')` matches `aaaaaab`. Not the same. – ikegami Jan 14 '14 at 20:36
  • @tenub, it's a literal he's injecting. But it can ONLY be a literal: the entire string, including spaces, control codes, everything. –  Jan 14 '14 at 20:42
  • @ikegami, `new RegExp('^'+string+'$', 'gi')` matches `aaaaaab` as well. – tenub Jan 14 '14 at 21:20
  • @tenub, I said it backwards. I meant: Say string is `a*b`, `new RegExp('^'+string+'$', 'gi')` matches `aaaaaaab`, while `new RegExp('^'+quotemeta(string)+'$', 'gi')` matches `a*b`. Not the same. – ikegami Jan 15 '14 at 05:38
  • I am still of the mind that escaping all non-word chars to make literals should be avoided in general. The notion of a `\W` literal is not correct in all engines. In POSIX, for one, and in hybrids, `\<` and `\>` are word boundaries, not literals. But JS does not have these problems, I guess. –  Jan 15 '14 at 17:13
  • @sln, That's a different regex language and thus irrelevant. Even the one linked above wouldn't work for POSIX regex patterns. It's impossible to write a function that escapes for any language. In Perlish regex languages such as JS's, unescaped word characters match literally and escaped non-word characters match literally. Some unescaped non-word characters also match literally, but using a list of them is needlessly complicated and it's not forward-compatible. – ikegami Jan 15 '14 at 17:21
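The two approaches debated in the comments above can be compared with a small sketch. The function names are hypothetical, and the explicit character list in `escapeListed` is only an assumed example of a metacharacter list, not the exact set either commenter had in mind. Under those assumptions, both produce patterns that match every printable ASCII character literally in a JavaScript regex built without the `u` flag:

```javascript
// Approach 1: escape everything that is not a word character.
function escapeAllNonWord(s) {
   return String(s).replace(/\W/g, "\\$&");
}

// Approach 2: escape only an explicit list of metacharacters
// (assumed list, for illustration only).
function escapeListed(s) {
   return String(s).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True if the escaped character, used as a whole pattern, matches itself.
function matchesItself(escapeFn, ch) {
   return new RegExp("^" + escapeFn(ch) + "$").test(ch);
}

// Check every printable ASCII character against both approaches.
const failures = [];
for (let code = 0x20; code <= 0x7e; code++) {
   const ch = String.fromCharCode(code);
   if (!matchesItself(escapeAllNonWord, ch)) failures.push("all-non-word: " + ch);
   if (!matchesItself(escapeListed, ch)) failures.push("listed: " + ch);
}

console.log(failures.length); // 0
```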

1 Answer

JavaScript doesn't have such a method natively. (And jQuery doesn't include one.)

Usually, when searching for a string pattern, you'd use String.prototype.indexOf. This method finds a string within a string, so you won't even need to convert the string pattern to a regex.

String.prototype.replace can also take a String pattern.

It is not exactly the same but it'll work for most string matching use cases.
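A minimal sketch of both methods with a plain string pattern (the sample strings are made up for illustration). Note that `replace` with a string pattern only replaces the first occurrence, which is one way it is "not exactly the same" as a global regex:

```javascript
const haystack = "price: a*b and a*b";
const needle = "a*b"; // contains a regex metacharacter, but is never compiled as a regex

// indexOf searches for the literal string; no escaping is needed.
const position = haystack.indexOf(needle);
console.log(position); // 7

// replace with a string pattern also matches literally, but only the first occurrence.
const firstOnly = haystack.replace(needle, "X");
console.log(firstOnly); // "price: X and a*b"

// One way to replace every literal occurrence without building a regex.
const all = haystack.split(needle).join("X");
console.log(all); // "price: X and X"
```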

Simon Boudrias