
I'm replacing a substring using the replace function and a regular expression. However, after escaping the pattern and running the replacement, I still have an extra '\' character in the result. I'm not really familiar with regex; can someone guide me?

I have implemented the escape character function found here: Is there a RegExp.escape function in Javascript?

RegExp.escape= function(s) {
    return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
};
const latexConversions = [
    ["\\cdot", "*"],
    ["\\right\)", ")"],
    ["\\left\(", "("],
    ["\\pi", "pi"],
    ["\\ln\((.*?)\)", "log($1)"],
    ["stdev\((.*?)\)", "std($1)"],
    ["stdevp\((.*?)\)", "std(\[$1\], \"uncorrected\")"],
    ["mean\((.*?)\)", "mean($1)"],
    ["\\sqrt\((.*?)\)", "sqrt($1)"],
    ["\\log\((.*?)\)", "log10($1)"],
    ["\(e\)", "e"],
    ["\\exp\((.*?)\)", "exp($1)"],
    ["round\((.*?)\)", "round($1)"],
    ["npr\((.*?),(.*?)\)", "($1!/($1-$2)!)"],
    ["ncr\((.*?),(.*?)\)", "($1!/($2!($1-$2)!))"],
    ["\\left\|", "abs("],
    ["\\right\|", ")"],
];

RegExp.escape = function (s) {
    var t = s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
    return t;
};

mathematicalExpression = "\\sqrt( )"

//Problem is here
mathematicalExpression = mathematicalExpression.replace(new RegExp(RegExp.escape(latexConversions[8][0]), 'g'), latexConversions[8][1]);

//Works
mathematicalExpression2 = mathematicalExpression.replace(/\\sqrt\((.*?)\)/g, "sqrt($1)"); 

alert("what I got: "+mathematicalExpression); // "\sqrt( )"
alert("Supposed to be: "+ mathematicalExpression2); // "sqrt( )"

I have a live example here: https://jsfiddle.net/nky342h5/2/

  • Corrected the typo and the error persists! – Irvin Tinoco Lopez Jul 05 '19 at 18:17
  • But there cannot be a match, because the two parentheses in `(.*?)` are also escaped by `RegExp.escape`. Just output the result of the call of `RegExp.escape` and you'll see what happened. Just debug. Also, note how one backslash before a `(` in your source string is useless, it just escapes the parenthesis that did not need escaping in the first place. – trincot Jul 05 '19 at 18:33

2 Answers

0

There are several misconceptions regarding the string literal "\\sqrt\((.*?)\)":

  1. This string in raw characters is: \sqrt((.*?)). Note how there is no difference between the two opening parentheses: the backslash in the string literal was not very useful. In other words, "\(" === "("
  2. Both opening parentheses will be escaped by RegExp.escape
  3. Points 1 and 2 apply equally to the closing parentheses, and point 2 also applies to the dot, the asterisk and the question mark: RegExp.escape escapes all of them.

In short, there is no way to tell whether a character is intended as a literal or as a regex special symbol: you are escaping all of them as if they were intended as literal characters.
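To see the points above in action, it helps to print the intermediate strings. The following is a minimal sketch; `escapeRegExp` is a hypothetical local name for the same escaping logic as the question's `RegExp.escape`, chosen to avoid touching the built-in `RegExp` object.

```javascript
// Hypothetical helper: same escaping logic as the RegExp.escape from the question.
function escapeRegExp(s) {
    return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// The string literal from the conversion table. "\(" is just "(" in a string literal.
const rawPattern = "\\sqrt\((.*?)\)";
const escapedPattern = escapeRegExp(rawPattern);

console.log(rawPattern);      // raw characters: \sqrt((.*?))
console.log(escapedPattern);  // every special character now has a backslash in front

// The escaped pattern only matches the literal text \sqrt((.*?)), not \sqrt( )
console.log(new RegExp(escapedPattern, 'g').test("\\sqrt( )"));
console.log(new RegExp(escapedPattern, 'g').test("\\sqrt((.*?))"));
```

The first test logs false and the second logs true, which is why the replacement in the question leaves the input unchanged.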

The solution:

Since you already are encoding regex specific syntax in your strings (like (.*?)), you might as well use regex literals instead of string literals.

In the case you highlighted, instead of this:

["\\sqrt\((.*?)\)", "sqrt($1)"]

...use this:

[/\\sqrt\((.*?)\)/g, "sqrt($1)"]

And let your code do:

mathematicalExpression = mathematicalExpression.replace(...latexConversions[8]);
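If the whole table is converted to regex literals, all conversions can be applied in one pass. This is only a sketch: the shortened table `literalConversions` and the function name `convertLatex` are assumptions made for illustration, not part of the original code.

```javascript
// Hypothetical, shortened conversion table: each pair is [regex literal, replacement].
const literalConversions = [
    [/\\cdot/g, "*"],
    [/\\left\(/g, "("],
    [/\\right\)/g, ")"],
    [/\\sqrt\((.*?)\)/g, "sqrt($1)"],
    [/\\pi/g, "pi"],
];

// Apply every conversion in table order to the expression.
function convertLatex(expression) {
    return literalConversions.reduce(
        (result, [pattern, replacement]) => result.replace(pattern, replacement),
        expression
    );
}

console.log(convertLatex("\\sqrt( )"));
console.log(convertLatex("2\\cdot\\sqrt(x)+\\pi"));
```

Because the patterns are regex literals, no escaping step is needed and the capture group in `(.*?)` keeps its special meaning.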

Alternative

If for some reason regex literals are a no-go, then define your own special syntax for (.*?). For instance, use the symbol µ to denote that particular regex syntax.

Then your array pair would look like this:

["\\sqrt(µ)", "sqrt($1)"],

...and code:

mathematicalExpression = mathematicalExpression.replace(
    new RegExp(RegExp.escape(latexConversions[8][0]).replace(/µ/g, '(.*?)'), 'g'), 
    latexConversions[8][1]
);

Note how here the (.*?) is introduced in the string after RegExp.escape has done its job.
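The placeholder approach can be wrapped in a small helper so that each table entry is turned into a regex the same way. The names `escapeLiteral` and `placeholderToRegExp` are hypothetical, and the sketch assumes that µ never occurs as a literal character in the expressions being converted.

```javascript
// Hypothetical helper: same escaping logic as the RegExp.escape from the question.
function escapeLiteral(s) {
    return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Escape the template first, then swap each µ placeholder for a lazy capture group.
function placeholderToRegExp(template) {
    return new RegExp(escapeLiteral(template).replace(/µ/g, '(.*?)'), 'g');
}

const sqrtPattern = placeholderToRegExp("\\sqrt(µ)");
const nprPattern = placeholderToRegExp("npr(µ,µ)");

console.log("\\sqrt( )".replace(sqrtPattern, "sqrt($1)"));
console.log("npr(5,2)".replace(nprPattern, "($1!/($1-$2)!)"));
```

Each µ becomes its own capture group, so templates with two placeholders can use $1 and $2 in the replacement.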

trincot
-1

Rather than escaping everything with RegExp.escape, write the extra backslashes by hand in the string literal, so that the string itself is already a valid regex source.

Replace     ["\\sqrt\((.*?)\)", "sqrt($1)"],  with     ["\\\\sqrt\\((.*?)\\)", "sqrt($1)"],

and replace the final replace call with
mathematicalExpression = mathematicalExpression.replace(new RegExp(latexConversions[8][0], 'g'), latexConversions[8][1]);