
I have a function that takes a string, interprets it as a calculation, and returns the result. Examples of valid calculations are:

3(log(e+pi^(22/3)))
44+3*(pi+2)/root(7)

The function is quite expensive, so I would like to run it only if the string actually is a calculation. Before I added functions like log and root, the constants pi and e, and implicit multiplication, I used the following regex:

/^((-?([0-9]+?\.)?[0-9]+?)\s?([+\-*\/%^]|(\*\*))\s?)+?(-?([0-9]+?\.)?[0-9]+?)$/

which no longer works. At this point I am not even sure whether a regex makes sense performance-wise. I expect only about 0.1% of strings to match (that is, to be valid calculations).

Do you have any ideas on how to create a well-performing regular expression, or a function that validates the calculation? The heavy function itself already determines whether the string is a valid calculation, but that takes a long time, so the pre-check does not need to be 100% accurate.

Emma
Teiem

2 Answers


Your question is in essence about regular expressions and string parsing. IMHO, your calculation string can be represented as a syntax tree, and it would be easier to build a parser for it than to write a rather complicated regex.
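As a rough illustration of the parser idea, here is a minimal recursive-descent validator. It is only a sketch under assumptions: the grammar, the names `tokenize` and `isCalculation`, and the lists of supported functions (`log`, `root`) and constants (`pi`, `e`) are guesses based on the examples in the question, not the actual rules of the asker's function. It only checks syntax and never evaluates anything.

```javascript
// One token per match: number, name, "**", or a single operator/parenthesis.
const TOKEN = /\s*(\d+(?:\.\d+)?|\.\d+|[a-z]+|\*\*|[-+*\/%^()])/iy;
const FUNCTIONS = new Set(["log", "root"]); // assumed function names
const CONSTANTS = new Set(["pi", "e"]);     // assumed constant names

// Returns an array of tokens, or null if an unknown character is found.
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    TOKEN.lastIndex = pos; // sticky flag: match exactly at pos
    const m = TOKEN.exec(input);
    if (!m) return input.slice(pos).trim() === "" ? tokens : null;
    tokens.push(m[1]);
    pos = TOKEN.lastIndex;
  }
  return tokens;
}

function isCalculation(input) {
  const tokens = tokenize(input);
  if (!tokens || tokens.length === 0) return false;
  let i = 0;
  const peek = () => tokens[i];
  const isNumber = (t) => t !== undefined && /^[\d.]/.test(t);
  const isName = (t) => t !== undefined && /^[a-z]/i.test(t);

  // expr := term (("+" | "-") term)*
  function expr() {
    if (!term()) return false;
    while (peek() === "+" || peek() === "-") {
      i++;
      if (!term()) return false;
    }
    return true;
  }
  // term := factor (("*" | "/" | "%") factor | factor)*
  // The second alternative is implicit multiplication, allowed here only
  // before a name or "(" (as in "3(...)" or "2pi").
  function term() {
    if (!factor()) return false;
    for (;;) {
      const t = peek();
      if (t === "*" || t === "/" || t === "%") {
        i++;
        if (!factor()) return false;
      } else if (isName(t) || t === "(") {
        if (!factor()) return false;
      } else {
        return true;
      }
    }
  }
  // factor := "-" factor | primary (("^" | "**") factor)?
  function factor() {
    if (peek() === "-") { i++; return factor(); }
    if (!primary()) return false;
    if (peek() === "^" || peek() === "**") { i++; return factor(); }
    return true;
  }
  // primary := number | constant | function "(" expr ")" | "(" expr ")"
  function primary() {
    const t = tokens[i++];
    if (isNumber(t)) return true;
    if (t === "(") return expr() && tokens[i++] === ")";
    if (isName(t)) {
      const name = t.toLowerCase();
      if (CONSTANTS.has(name)) return true;
      if (FUNCTIONS.has(name)) {
        return tokens[i++] === "(" && expr() && tokens[i++] === ")";
      }
    }
    return false;
  }

  // Valid only if the whole token list was consumed.
  return expr() && i === tokens.length;
}
```

Because the parser bails out at the first token that does not fit, ordinary text is usually rejected after one or two tokens, which matters when only a small fraction of inputs are calculations.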

Peipei
  • As your link shows, it is impossible to match arbitrarily nested parentheses with regular expressions. – msw May 01 '19 at 19:56
  • @msw Well, the point is that I do not think one complicated regex is enough to accomplish this task. But there is still some room for using regexes. For example, if some patterns of calculation expressions are very frequent in your database, you can combine multiple regular expressions and use them as part of your parser. See https://github.com/mozilla/treeherder/pull/181: it replaced some of the parser's functions with several regexes for known patterns. – Peipei May 02 '19 at 02:32
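
One way to read the two comments above together: a regex cannot check nesting, but it can cheaply reject most non-calculations, and a plain counter can handle the parentheses. The sketch below is an assumption-laden illustration, not code from the linked pull request; the name `mightBeCalculation` and the list of known names are hypothetical. It is meant to run before the expensive function, which stays the final authority.

```javascript
// Only digits, whitespace, dots, operators, parentheses and letters are allowed.
const ALLOWED_CHARS = /^[\d\s.+\-*\/%^()a-z]+$/i;
// Assumed list of constants and functions, taken from the question's examples.
const KNOWN_NAMES = new Set(["pi", "e", "log", "root"]);

function mightBeCalculation(text) {
  // 1. Cheap character whitelist (a single regex pass).
  if (!ALLOWED_CHARS.test(text)) return false;

  // 2. Every run of letters must be a known name.
  //    Limitation: names written back to back, such as "pie" for pi*e,
  //    are rejected by this sketch.
  const names = text.toLowerCase().match(/[a-z]+/g) || [];
  if (!names.every((name) => KNOWN_NAMES.has(name))) return false;

  // 3. There must be at least one operand candidate.
  if (!/\d/.test(text) && names.length === 0) return false;

  // 4. Parentheses must balance; a counter does what a regex cannot.
  let depth = 0;
  for (const ch of text) {
    if (ch === "(") depth++;
    else if (ch === ")" && --depth < 0) return false;
  }
  return depth === 0;
}
```

This deliberately accepts some invalid strings (for example `3++`), which is acceptable when the expensive function validates again afterwards.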

I have written a function that validates the calculation. Here is the code:

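    // Stand-ins for the `_.is.*` helpers used below. The original helpers are
    // not shown, so these definitions are assumptions.
    const _ = {
        is: {
            Operation: (el) => "+-*/%^".includes(el),
            Numeric: (el) => el >= "0" && el <= "9",
            Letter: (el) => /[a-z]/i.test(el),
        },
    };

    const isValidCalc = (calc) => {
        // Which kinds of character have been seen: brackets, numbers, letters, operators.
        const contains = { br: false, num: false, let: false, op: false };
        let prev;      // kind of the previous character
        let level = 0; // current parenthesis nesting depth

        // Normalise: "**" becomes "^", spaces are dropped, e and pi become digits.
        const chars = [...calc
            .replace(/\*\*/g, "^")
            .replace(/ /g, "")
            .replace(/e/g, Math.E)
            .replace(/pi/g, Math.PI)];

        // `some` stops at the first character that makes the string invalid.
        const invalid = chars.some((el) => {
            if (el === "(") {
                prev = "open";
                level++;
                return false;
            }
            if (el === ")") {
                // Unmatched ")" or a function name directly followed by ")".
                if (level-- === 0 || prev === "letter") return true;

                prev = "close";
                contains.br = true;
                return false;
            }
            if (_.is.Operation(el)) {
                if (prev === "operator" || prev === "letter") return true;

                prev = "operator";
                contains.op = true;
                return false;
            }
            if (_.is.Numeric(el) || el === ".") {
                if (prev === "close" || prev === "letter") return true;

                prev = "numeric";
                contains.num = true;
                return false;
            }
            if (_.is.Letter(el)) {
                prev = "letter";
                contains.let = true;
                return false;
            }

            return true; // any other character is not allowed
        });

        // Valid if nothing was rejected, parentheses are balanced, there is a number,
        // letters only occur together with brackets, and there is a function or operator.
        return !invalid && level === 0 && contains.num &&
            (!contains.let || contains.br) && (contains.let || contains.op);
    };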
Teiem