410

I am designing a regular expression tester in HTML and JavaScript. The user will enter a regex, a string, and choose the function they want to test with (e.g. search, match, replace, etc.) via radio button and the program will display the results when that function is run with the specified arguments. Naturally there will be extra text boxes for the extra arguments to replace and such.

My problem is getting the string from the user and turning it into a regular expression. If I say that they don't need to have //'s around the regex they enter, then they can't set flags, like g and i. So they have to have the //'s around the expression, but how can I convert that string to a regex? It can't be a literal since it's a string, and I can't pass it to the RegExp constructor since it's not a valid pattern string with the //'s still around it. Is there any other way to make a user input string into a regex? Will I have to parse the string and flags of the regex with the //'s then construct it another way? Should I have them enter a string, and then enter the flags separately?

nhahtdh
  • 55,989
  • 15
  • 126
  • 162
Gordon Gustafson
  • 40,133
  • 25
  • 115
  • 157

15 Answers

714

Use the RegExp object constructor to create a regular expression from a string:

var re = new RegExp("a|b", "i");
// same as
var re = /a|b/i;
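To make the equivalence concrete, here is a small sketch. The variable names are made up for illustration, and `userPattern` simply stands in for whatever text the user typed into a form field. Backslashes only need doubling when the pattern is written as a string literal in source code; text that arrives at run time can be passed to the constructor as-is.

```javascript
// Literal and constructor forms describe the same pattern.
const fromLiteral = /\w+@\w+/i;
const fromString = new RegExp("\\w+@\\w+", "i"); // backslashes doubled inside a string literal

// Hypothetical run-time input: String.raw stands in for a text box's .value,
// which already contains single backslashes and needs no extra escaping.
const userPattern = String.raw`\d{3}-\d{4}`;
const fromUser = new RegExp(userPattern, "g");

console.log(fromLiteral.source === fromString.source); // true
console.log("call 555-1234".match(fromUser)); // [ '555-1234' ]
```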
Gumbo
  • 643,351
  • 109
  • 780
  • 844
  • 1
    would be nice to have online tool with a input field – holms Nov 14 '13 at 04:10
  • 76
    When doing it this way, you must escape the backslash, e.g. `var re = new RegExp("\\w+");` – JD Smith Sep 12 '14 at 15:59
  • 16
    @holms [regex101.com](https://regex101.com/) is a great regex online tool as well – Fran Herrero Jul 18 '16 at 08:38
  • 3
    It took me a while to see that there are no trailing slashes required – ESP32 Dec 07 '16 at 23:52
  • @JD Smith and also escape the double quotes! – Luis Paulo Apr 03 '18 at 00:48
  • Actually @LuisPaulo you can't escape the double quotes surrounding the string arguments to the RegExp constructor method; you'd only escape double quotes that were in the string itself. – JD Smith Apr 10 '18 at 17:12
  • 2
    @JDSmith I didn't mean it in your example. I meant that you need to escape double quotes if you want them to be a part of the regex provided it is hard coded. Obviously, none of this applies if the string is in a variable like from an `` HTML tag. `var re = new RegExp("\"\\w+\"");` is an example of a hard coded regex using the RegExp constructor and the escaping of the double quotes __is__ necessary. What I mean by a string in a variable is that you can just do `var re = new RegExp(str);` and `str` may contain double quotes or backslashes without a problem. – Luis Paulo Apr 17 '18 at 00:23
  • @JDSmith I didn't mean the quotes in your example :) I was just completing your sentence "When doing it this way, you must escape the backslash" with "and double quotes". – Luis Paulo Apr 17 '18 at 00:24
  • Are those two really the same? If it's a string (object constructor), isn't it compiled each time the regex is run [(MSDN source)](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#Description)? Wouldn't it be better to construct a literal notation if it's run in a loop? – smurtagh Sep 07 '18 at 14:51
  • Also, I'm replacing \ with \\ in input string variable in a following way: `var r = new RegExp(str.split('\\').join('\\\\'));`. Notice, that I'm escaping slashes in string literal. – AntonAL Jun 09 '20 at 17:04
  • this doesn't answer the original question though? What about the ability for user input to specify `/i` at the end – Garr Godfrey Mar 02 '23 at 19:05
82
var flags = inputstring.replace(/.*\/([gimy]*)$/, '$1');
var pattern = inputstring.replace(new RegExp('^/(.*?)/'+flags+'$'), '$1');
var regex = new RegExp(pattern, flags);

or

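var match = inputstring.match(new RegExp('^/(.*?)/([gimy]*)$'));
// sanity check: match is null when the input is not of the form /pattern/flags
if (!match) throw new SyntaxError('Expected /pattern/flags, got: ' + inputstring);
var regex = new RegExp(match[1], match[2]);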
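If you would rather let the RegExp constructor do the validating, one possible sketch looks like the following. The function name `parseRegexInput` and the shape of the returned object are my own assumptions, not part of the answer above. It accepts any lowercase letters as flags and leaves it to the constructor to reject bad patterns, unknown flags, or repeated flags.

```javascript
// Hypothetical helper: split "/pattern/flags" and report errors instead of throwing.
function parseRegexInput(input) {
  // Greedy (.*) keeps inner slashes in the pattern; the s flag lets . span newlines.
  const match = input.match(/^\/(.*)\/([a-z]*)$/s);
  if (!match) {
    return { regex: null, error: "Expected input of the form /pattern/flags" };
  }
  try {
    // The constructor throws a SyntaxError for invalid patterns or flags.
    return { regex: new RegExp(match[1], match[2]), error: null };
  } catch (e) {
    return { regex: null, error: e.message };
  }
}

console.log(parseRegexInput("/a|b/i").regex);   // /a|b/i
console.log(parseRegexInput("/foo/gg").error);  // message about invalid flags
console.log(parseRegexInput("foo").error);      // Expected input of the form /pattern/flags
```

The error string can then be shown next to the input field rather than crashing the tester.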
Mak
  • 19,913
  • 5
  • 26
  • 32
Anonymous
  • 49,213
  • 1
  • 25
  • 19
  • You should consider that an invalid input like `/\/` is recognized. – Gumbo May 17 '09 at 15:14
  • 12
    Or let the RegExp constructor fail, "trailing \ in regular expression", instead of writing a complicated parser. – Anonymous May 17 '09 at 15:23
  • **Note** that users can input as many flags as they want, e.g.: `/foo/ggggg`. In the first example you could change the `flags` replace to `replace('/.*\/(?!.*(.).*\1)([gimy]*)$/', '$2')`. Or use the following regex for the 2nd example `^\/(.*)\/(?!.*(.).*\2)([gimy]*)$`, what will put the flags in match group 3. – luukvhoudt May 14 '21 at 14:33
35

Here is a one-liner: str.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&')

I got it from the escape-string-regexp NPM module.

Trying it out:

escapeStringRegExp.matchOperatorsRe = /[|\\{}()[\]^$+*?.]/g;
function escapeStringRegExp(str) {
    return str.replace(escapeStringRegExp.matchOperatorsRe, '\\$&');
}

console.log(new RegExp(escapeStringRegExp('example.com')));
// => /example\.com/

Using tagged template literals with flags support:

function str2reg(flags = 'u') {
    return (...args) => new RegExp(escapeStringRegExp(evalTemplate(...args))
        , flags)
}

function evalTemplate(strings, ...values) {
    let i = 0
    return strings.reduce((str, string) => `${str}${string}${
        i < values.length ? values[i++] : ''}`, '')
}

console.log(str2reg()`example.com`)
// => /example\.com/u
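For a regex tester, escaping could be offered as an optional "literal text" mode next to the normal mode. The sketch below assumes a hypothetical `buildRegExp` helper and a `literal` option that are not part of the escape-string-regexp module; only the character class is taken from the one-liner above.

```javascript
// Same character class as the one-liner above.
function escapeForRegExp(text) {
  return text.replace(/[|\\{}()[\]^$+*?.]/g, "\\$&");
}

// Hypothetical helper: treat the input as a pattern, or as literal text when asked to.
function buildRegExp(input, { literal = false, flags = "" } = {}) {
  return new RegExp(literal ? escapeForRegExp(input) : input, flags);
}

console.log(buildRegExp("a.c").test("abc"));                    // true: . matches any character
console.log(buildRegExp("a.c", { literal: true }).test("abc")); // false: . must be a literal dot
console.log(buildRegExp("a.c", { literal: true }).test("a.c")); // true
```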
Rivenfall
  • 1,189
  • 10
  • 15
  • 2
    this seems to be the opposite of what the question asks. It wants to treat a string like a regex expression, so user can give input like `/.*\.com$/` and have it match `example.com` – Garr Godfrey Mar 02 '23 at 19:01
  • right, in this case, based on the end of the question "Should I have them enter a string, and then enter the flags separately?" the response could be to just use the RegExp constructor, or extract the flags like in https://stackoverflow.com/a/874742/2234156 – Rivenfall Mar 07 '23 at 09:53
19

Use the JavaScript RegExp object constructor.

var re = new RegExp("\\w+");
re.test("hello");

You can pass flags as a second string argument to the constructor. See the documentation for details.

Ayman Hourieh
  • 132,184
  • 23
  • 144
  • 116
13

In my case the user input was sometimes surrounded by delimiters and sometimes not, so I added another case:

var regParts = inputstring.match(/^\/(.*?)\/([gim]*)$/);
if (regParts) {
    // the parsed pattern had delimiters and modifiers. handle them. 
    var regexp = new RegExp(regParts[1], regParts[2]);
} else {
    // we got pattern string without delimiters
    var regexp = new RegExp(inputstring);
}
staabm
  • 1,535
  • 22
  • 20
  • 4
    you could always use the `.split()` function instead of a long regex string. `regParts = inputstring.split('/')` this would make `regParts[1]` the regex string, and `regParts[2]` the delimiters (assuming the setup of the regex is `/.../gim`). You could check if there are delimiters with `regParts[2].length < 0`. – ZomoXYZ Apr 21 '16 at 17:17
  • 1
    @ZomoXYZ Don't use split, it won't handle escaped `/` in the regex – Tofandel Mar 22 '21 at 17:38
  • You can do even better: `function stringToRegex(s, m) { return (m = s.match(/^(.)(.*?)\1([gimsuy]*)$/)) ? new RegExp(m[2], m[3]) : new RegExp(s); }` – Tofandel Mar 22 '21 at 18:10
12

Try using the following function:

const stringToRegex = str => {
    // Main regex
    const main = str.match(/\/(.+)\/.*/)[1]
    
    // Regex options
    const options = str.match(/\/.+\/(.*)/)[1]
    
    // Compiled regex
    return new RegExp(main, options)
}

You can use it like so:

"abc".match(stringToRegex("/a/g"))
//=> ["a"]
Richie Bendall
  • 7,738
  • 4
  • 38
  • 58
6

Here is my one-liner function that handles custom delimiters and duplicate flags:

// One liner
var stringToRegex = (s, m) => (m = s.match(/^([\/~@;%#'])(.*?)\1([gimsuy]*)$/)) ? new RegExp(m[2], m[3].split('').filter((i, p, s) => s.indexOf(i) === p).join('')) : new RegExp(s);

// Readable version
function stringToRegex(str) {
  const match = str.match(/^([\/~@;%#'])(.*?)\1([gimsuy]*)$/);
  return match ? 
    new RegExp(
      match[2],
      match[3]
        // Filter redundant flags, to avoid exceptions
        .split('')
        .filter((char, pos, flagArr) => flagArr.indexOf(char) === pos)
        .join('')
    ) 
    : new RegExp(str);
}

console.log(stringToRegex('/(foo)?\/bar/i'));
console.log(stringToRegex('#(foo)?\/bar##gi')); //Custom delimiters
console.log(stringToRegex('#(foo)?\/bar##gig')); //Duplicate flags are filtered out
console.log(stringToRegex('/(foo)?\/bar')); // Treated as string
console.log(stringToRegex('gig')); // Treated as string
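As an alternative to the `split`/`filter`/`indexOf` chain, duplicate flags can also be removed with a `Set`. This is only a sketch of that one step, and `dedupeFlags` is a name I made up. It does not validate the letters themselves, so an unsupported flag still makes the constructor throw.

```javascript
// Hypothetical helper: drop repeated flag letters, keeping first occurrences in order.
function dedupeFlags(flags) {
  return [...new Set(flags)].join("");
}

console.log(dedupeFlags("gig"));                          // "gi"
console.log(new RegExp("foo", dedupeFlags("gig")).flags); // "gi"
```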
Tofandel
  • 3,006
  • 1
  • 29
  • 48
  • 5
    Just because you crammed 10 statements into one line, it's not a one-liner :) – raveren Apr 30 '21 at 08:48
  • 2
    10 statements? It's just a ternary expression, and if it's one line, it's a one liner ;) And if you say it's 3 line because of the function that's just for readability otherwise you can do `const stringToRegex = (s, m) => (m = s.match(/^([\/~@;%#'])(.*?)\1([gimsuy]*)$/)) ? new RegExp(m[2], m[3].split('').filter((i, p, s) => s.indexOf(i) === p).join('')) : new RegExp(s);` – Tofandel Apr 30 '21 at 14:29
  • Readable and RegExp? LOL – TomeeNS Aug 13 '22 at 06:08
  • Nice. this seems to be the only useful answer to the question. – Garr Godfrey Mar 02 '23 at 19:03
3

I suggest you also add separate checkboxes or a textfield for the special flags. That way it is clear that the user does not need to add any //'s. In the case of a replace, provide two textfields. This will make your life a lot easier.

Why? Because otherwise some users will add //'s while others will not. And some will make a syntax error. Then, after you have stripped the //'s, you may end up with a syntactically valid regex that is nothing like what the user intended, leading to strange behaviour (from the user's perspective).
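A rough sketch of how the separate fields could feed the tester follows. The function name `runRegexTest`, the option names, and the result shape are all assumptions for illustration; the values would come from the pattern field, the flags field, the subject field, the selected radio button, and (for replace) the second text field.

```javascript
// Hypothetical glue code: build the regex from separate fields and run the chosen method.
function runRegexTest({ pattern, flags = "", subject, method, replacement = "" }) {
  let regex;
  try {
    regex = new RegExp(pattern, flags); // throws SyntaxError on a bad pattern or bad flags
  } catch (e) {
    return { ok: false, error: e.message };
  }
  switch (method) {
    case "search":
      return { ok: true, result: subject.search(regex) };
    case "match":
      return { ok: true, result: subject.match(regex) };
    case "replace":
      return { ok: true, result: subject.replace(regex, replacement) };
    case "test":
      return { ok: true, result: regex.test(subject) };
    default:
      return { ok: false, error: "Unknown method: " + method };
  }
}

console.log(runRegexTest({
  pattern: "a|b", flags: "gi", subject: "A cab", method: "replace", replacement: "_",
})); // { ok: true, result: '_ c__' }
```

Because the pattern never carries delimiters, there is nothing to strip and no ambiguity about where the flags start.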

Zombo
  • 1
  • 62
  • 391
  • 407
Stephan202
  • 59,965
  • 13
  • 127
  • 133
2

This will work also when the string is invalid or does not contain flags etc:

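function regExpFromString(q) {
  let flags = q.replace(/.*\/([gimuy]*)$/, '$1');
  // No "/flags" tail found: treat the whole input as a bare pattern
  const delimited = flags !== q;
  if (!delimited) flags = '';
  // Strip the delimiters even when the flags are empty, e.g. "/abc/"
  let pattern = (delimited ? q.replace(new RegExp('^/(.*?)/' + flags + '$'), '$1') : q);
  try { return new RegExp(pattern, flags); } catch (e) { return null; }
}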

console.log(regExpFromString('\\bword\\b'));
console.log(regExpFromString('\/\\bword\\b\/gi'));
            
kofifus
  • 17,260
  • 17
  • 99
  • 173
1

Thanks to earlier answers, this block serves well as a general-purpose solution for turning a configurable string into a RegExp for filtering text:

var permittedChars = '^a-z0-9 _,.?!@+<>';
permittedChars = '[' + permittedChars + ']';

var flags = 'gi';
var strFilterRegEx = new RegExp(permittedChars, flags);

log.debug ('strFilterRegEx: ' + strFilterRegEx);

strVal = strVal.replace(strFilterRegEx, '');
// this replaces the hard-coded solution:
// strVal = strVal.replace(/[^a-z0-9 _,.?!@+]/ig, '');
Gene Bo
  • 11,284
  • 8
  • 90
  • 137
1

You can ask for flags using checkboxes then do something like this:

var userInput = formInput;
var flags = '';
if(formGlobalCheckboxChecked) flags += 'g';
if(formCaseICheckboxChecked) flags += 'i';
var reg = new RegExp(userInput, flags);
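With more than two checkboxes the chain of `if` statements can be replaced by a lookup table. The following is a sketch only: the option names mirror the standard RegExp property names, but `FLAG_LETTERS` and `flagsFromOptions` are hypothetical names, and the `options` object stands in for the checkbox states read from the form.

```javascript
// Map of option names to flag letters, in the order they should appear.
const FLAG_LETTERS = {
  global: "g",
  ignoreCase: "i",
  multiline: "m",
  dotAll: "s",
  unicode: "u",
  sticky: "y",
};

// Hypothetical helper: turn { global: true, ignoreCase: true } into "gi".
function flagsFromOptions(options) {
  return Object.keys(FLAG_LETTERS)
    .filter((name) => options[name])
    .map((name) => FLAG_LETTERS[name])
    .join("");
}

const reg = new RegExp("a|b", flagsFromOptions({ ignoreCase: true, global: true }));
console.log(reg); // /a|b/gi
```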
Akshat Mahajan
  • 9,543
  • 4
  • 35
  • 44
Pim Jager
  • 31,965
  • 17
  • 72
  • 98
0

Safer, but not safe. (A version of Function that didn't have access to any other context would be good.)

const regexp = Function('return ' + string)()
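To illustrate what "safer, but not safe" means here, the sketch below (with a made-up wrapper name, `evaluateWithFunction`) shows that code created with `Function` cannot see the local variables of the calling function, yet it can still reach globals and run arbitrary expressions.

```javascript
// Hypothetical wrapper around the one-liner above.
function evaluateWithFunction(source) {
  const secret = "local value"; // not visible to code created with Function
  return Function("return " + source)();
}

console.log(evaluateWithFunction("/a|b/i"));            // /a|b/i
console.log(evaluateWithFunction("typeof secret"));     // "undefined": local scope is hidden
console.log(evaluateWithFunction("typeof globalThis")); // "object": globals are still reachable
```

So any input that is not a regex literal is still executed as code, which is why parsing the pattern and flags and calling the RegExp constructor remains the more defensive route.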
Stephen Todd
  • 365
  • 3
  • 12
0

I found @Richie Bendall's solution very clean. I added a few small modifications because it falls apart and throws an error (maybe that's what you want) when passed non-regex strings.

const stringToRegex = (str) => {
    const re = /\/(.+)\/([gim]?)/
    const match = str.match(re);
    if (match) {
        return new RegExp(match[1], match[2])
    }
}

Using [gim]? in the pattern will ignore any match[2] value if it's invalid. Note that [gim]? captures at most one flag; use [gim]* to accept several. You can omit the [gim]? pattern if you want an error to be thrown when the regex options are invalid.

Niv
  • 523
  • 1
  • 8
  • 19
0

Here is a runnable snippet with an input field that converts the input to a regex:

  • If the user did not properly delimit with /, no flags are assumed
  • If the pattern includes a /, and thus escaped like \/, then it is retained and not mistaken for a / delimiter.

function toRegExp(s) {
    const [, ...parts] = s.match(/^\/((?:\\.|[^\\])*)\/(.*)$/) ?? [, s];
    try {
        return RegExp(...parts);
    } catch (e) {
        return e; // Could for instance be an error about invalid flags
    }
}

const [input, output] = document.querySelectorAll("input, span");
input.addEventListener("input", refresh);
refresh()

function refresh() {
    const regex = toRegExp(input.value);
    output.textContent = regex;
}
Regex:<br>
<input value="/test/gi"><p>
RegExp object back to string:<br>
<span></span>
trincot
  • 317,000
  • 35
  • 244
  • 286
-6

I use eval to solve this problem.

For example:

    function regex_exec() {

        // Important! As @Samuel Faure mentioned, eval on user input is a serious security risk, so weigh that risk before using this method.
        var userInput = $("#regex").val();

        // eval() turns a string such as "/a|b/i" into a RegExp object
        var patt = eval(userInput);

        $("#result").val(patt.exec($("#textContent").val()));
    }
Playhi
  • 137
  • 1
  • 4
  • 7
    eval on userInput is a crazy security risk – Samuel-Zacharie Faure Jul 22 '19 at 14:53
  • 3
    mr bobby tables ! – Luiz Felipe Mar 04 '20 at 14:44
  • @SamuelFaure is it always though? If this application is a JavaScript web app, then the client has full access to the environment via the console. If it's code running in a server side environment obviously eval is a nitemare. I just struggle to see why eval is bad by default in a client side application running in a browser. – Peter Avram Jan 14 '23 at 03:34