The Context

I'm in need of a bit of code that takes a very simple math string and runs PHP's eval() function. For example ...

  $math = '25 * (233 - 1.5)';
  echo eval("return $math;"); // returns 5787.5

However, eval() is quite dangerous in the wrong hands, so the variable must be scrubbed. For the example above, a simple preg_replace would be ...

  $math = '25 * (233 - 1.5)';
  $replace = '/[^0-9\(\)\.\,\+\-\*\/\s]/';
  $math = preg_replace($replace, '', $math);
  echo eval("return $math;"); // returns 5787.5

... which ensures $math contains only the allowed characters ( ) . , + - * /, whitespace and digits, and no malicious code.
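To make the effect of that scrub concrete, here is a minimal sketch that applies the same character class to a hostile string. The helper name `scrub_math` is hypothetical (it is not in my original code); it only wraps the preg_replace call shown above.

```php
<?php
// Hypothetical helper: keep only digits, whitespace and ( ) . , + - * /
function scrub_math(string $input): string
{
    // Same negated character class as in the question
    return preg_replace('/[^0-9\(\)\.\,\+\-\*\/\s]/', '', $input);
}

// Letters, quotes and semicolons are stripped; parentheses survive
echo scrub_math('25 * (233 - 1.5); system("ls");'), PHP_EOL;
// prints: 25 * (233 - 1.5) ()
```

Note that what is left is harmless but not necessarily a valid expression, so the eval() call can still fail on scrubbed input.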

The Question

I want to allow a few very specific words (PHP math functions), such as pow, pi, min, max, etc.

What's the cleanest way to validate both characters and words in regex?

So if given this string ...

pow(25,2) / pi(); hack the pentagon;

... how would I remove everything that wasn't in the $replace regex, but preserve the words pow and pi?

designosis
  • I was thinking an `explode` and re-`implode` might do it, but it's not the most elegant solution :\ – designosis Feb 17 '21 at 02:33
  • I wonder if it would actually be easier to write a reverse-Polish parser to do this, and thus avoid the perils of `eval` altogether. – Tangentially Perpendicular Feb 17 '21 at 03:10
  • After a quick trawl and a little Google-foo I found a couple of PHP calculators that might form the basis of what you want. Take a look at [this](https://github.com/Sjord/calculator/blob/master/calculator.php), for example. – Tangentially Perpendicular Feb 17 '21 at 03:20
  • Interesting ... but how would I run the `pow()` or `pi()` functions? EVAL is so very close, it just needs a bit of filtering. Also I'm reluctant to add that many lines of code for such a small feature lol – designosis Feb 17 '21 at 03:24
  • Redacting my last statement ... you lead me down the right track, as https://github.com/chriskonnertz/string-calc might be just what I need – designosis Feb 17 '21 at 03:28
  • Yep, that library works perfectly :) I really could delete this question now, but I'd love to know how to do this with regex, so I'll leave it up for a bit to learn from the masters :) – designosis Feb 17 '21 at 03:32

1 Answer

In PHP (PCRE), you can match the words that you want to keep and then use the (*SKIP)(*FAIL) approach so that they are excluded from the match.

You can also shorten the character class by removing the backslashes (with the - placed at the end so it stays literal), and if you use a delimiter other than / in PHP, you don't have to escape the / either.

Because the characters matched by the negated character class are replaced with an empty string, you can add a + quantifier so that each run of consecutive unwanted characters is removed in a single replacement.

\b(?:p(?:i|ow)|m(?:in|ax))\b(*SKIP)(*FAIL)|[^0-9().,+*/\s-]+

The pattern matches

  • \b(?:p(?:i|ow)|m(?:in|ax))\b Match either pi, pow, min or max as a whole word
  • (*SKIP)(*FAIL)| What is matched so far should not be part of the match result
  • [^0-9().,+*/\s-]+ Match 1+ times any char except the listed chars in the negated character class
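If the list of allowed words grows, the first alternative could be generated instead of hand-written. The following is only a sketch: `build_scrub_pattern` is a hypothetical helper name, and it produces a plain alternation (pow|pi|min|max) rather than the factored form above, which should behave the same because of the surrounding \b.

```php
<?php
// Hypothetical helper: build the (*SKIP)(*FAIL) pattern from a whitelist
function build_scrub_pattern(array $allowedWords): string
{
    // Escape each word for use inside a ~-delimited pattern
    $quoted = array_map(static function ($word) {
        return preg_quote($word, '~');
    }, $allowedWords);

    return '~\b(?:' . implode('|', $quoted) . ')\b(*SKIP)(*FAIL)|[^0-9().,+*/\s-]+~';
}

$pattern = build_scrub_pattern(['pow', 'pi', 'min', 'max']);

// sqrt is not on the whitelist, so only its name is removed
echo preg_replace($pattern, '', 'max(2, 3) + sqrt(16) - pi()'), PHP_EOL;
// prints: max(2, 3) + (16) - pi()
```

As the output shows, stripping a word that is not allowed can silently change the meaning of the expression, so rejecting such input outright may be the safer choice in some applications.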

If you don't want the leftover spaces at the start and end, consider calling trim() on $math.

$math = 'pow(25,2) / pi(); hack the pentagon;';
$replace = '~\b(?:p(?:i|ow)|m(?:in|ax))\b(*SKIP)(*FAIL)|[^0-9().,+*/\s-]+~';
$math = preg_replace($replace, '', $math);
echo eval("return $math;"); // returns 198.94367886487
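Since the scrubbed string is not guaranteed to be a valid expression, it may be worth trimming it and guarding the eval() call. A minimal sketch, assuming PHP 7 or later (where eval() throws ParseError on invalid code); the function name `eval_math` is hypothetical:

```php
<?php
// Hypothetical wrapper: scrub, trim, then evaluate defensively
function eval_math(string $math)
{
    $replace = '~\b(?:p(?:i|ow)|m(?:in|ax))\b(*SKIP)(*FAIL)|[^0-9().,+*/\s-]+~';
    $math = trim(preg_replace($replace, '', $math));

    if ($math === '') {
        return null; // nothing left to evaluate
    }

    try {
        return eval("return $math;");
    } catch (\Throwable $e) {
        // ParseError or a runtime Error caused by a malformed expression
        return null;
    }
}

echo eval_math('pow(25,2) / pi(); hack the pentagon;'), PHP_EOL;
var_dump(eval_math('25 * (')); // NULL, because the expression does not parse
```

Returning null on failure is just one possible convention; throwing or returning an error message would work equally well.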
The fourth bird