
C++ has a regex library. Using it, I want to tokenize the following mathematical expression

(bar+3)*foo/3+-1

as

(
bar
+
3
)
*
foo
/
3
+
-1

To do this, I tried the following, but contrary to what I expected it gives no output and does not tokenize the string:

std::string s ("(bar+3)*foo/3+-1");
std::smatch m;
std::regex e ("^[-+(]*[[:digit:]]+[)]*([-+*/][-+(]*[[:digit:]]+[)]*)*$");

How can it be done?

Edit: Sorry for miswriting.

  • Define *"does not work"* - what happens? What errors do you get, what do you see when debugging, etc.? – UnholySheep Mar 04 '18 at 11:52
  • @snr might help to show the actual code where you use the regex – john Mar 04 '18 at 11:54
  • The regex above is clearly wrong since it does not match any alphabetic characters, but your target string includes alphabetic characters. – john Mar 04 '18 at 11:55
  • Have you tried any one of the several online regex testers? Just Google it. Handy for sorting through your regex. – lurker Mar 04 '18 at 11:56
  • Regexes are incapable of parsing recursive languages. Mathematical expressions are recursive languages. So I think you've probably picked the wrong tool for the job. – john Mar 04 '18 at 11:57
  • what about the solution, https://stackoverflow.com/questions/34577678/simple-mathematical-expression-parsing , @john – Soner from The Ottoman Empire Mar 04 '18 at 12:02
  • @snr I can use a screwdriver as a hammer ... and it might work for the specific case that I put it to; but it won't work in the general case where a hammer is required. You have to decide if you're looking for something to work for the general case; or a very specific one where 'it'll do' – UKMonkey Mar 04 '18 at 12:04
  • Umm, what way would you propose for me to do the parsing? @UKMonkey – Soner from The Ottoman Empire Mar 04 '18 at 12:05
  • @snr I'm sure you can google just as well as me. https://archive.codeplex.com/?p=fastmathparser – UKMonkey Mar 04 '18 at 12:06
  • @snr That solution is tokenising an expression, that's a first step to parsing. If that's all you want to do then use a regex. But to evaluate a mathematical expression you need to parse it. – john Mar 04 '18 at 12:07
  • Yes, that is all I want, I will not evaluate it. @john – Soner from The Ottoman Empire Mar 04 '18 at 12:09
  • `>>what about the solution...` that solution doesn't work for your example, why do you even bring it up? – Killzone Kid Mar 04 '18 at 12:20
  • @snr Well, the difficult part is that you want `-` to be part of the integer, but you also presumably want it to be an operator as well. In `-1+-x`, the first minus is part of the integer but the second is a unary minus. If you are prepared to drop that requirement (and always treat `-` as an operator) it will be easier. – john Mar 04 '18 at 12:21

1 Answer


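This code tokenizes a mathematical expression:

#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s = "(bar+3)*foo/3+-1";
    // A token is an unsigned integer, an identifier, or a single operator/parenthesis.
    std::regex re("[[:digit:]]+|[[:alpha:]][[:alnum:]]*|[-+*/()]");
    // Walk over the successive non-overlapping matches in s.
    auto tokens_begin =
        std::sregex_iterator(s.begin(), s.end(), re);
    auto tokens_end = std::sregex_iterator();
    while (tokens_begin != tokens_end)
    {
        std::cout << tokens_begin->str() << std::endl;
        ++tokens_begin;
    }
}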

Output

(
bar
+
3
)
*
foo
/
3
+
-
1

In this code a - is always treated as an operator, so -1 is tokenized as a unary minus followed by an unsigned number. It is probably not possible to do any better than this without doing some real parsing.
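If the -1 form from the question is still wanted, a small amount of context tracking (well short of a full parser) may be enough. The sketch below is not part of the original answer: it reuses the same regex and merges a - into the following number only when the - appears at the start of the input or right after an operator or an opening parenthesis. The function name `tokenize_signed` is hypothetical, and the rule for deciding when a - is a sign is an assumption that may not fit every grammar.

```cpp
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): tokenize with the answer's regex,
// then fold a '-' into the number that follows it when the '-' cannot be a
// binary operator (start of input, or directly after an operator or '(').
std::vector<std::string> tokenize_signed(const std::string& s)
{
    static const std::regex re("[[:digit:]]+|[[:alpha:]][[:alnum:]]*|[-+*/()]");
    std::vector<std::string> out;
    bool pending_sign = false; // true when out.back() is a '-' in sign position

    for (auto it = std::sregex_iterator(s.begin(), s.end(), re);
         it != std::sregex_iterator(); ++it)
    {
        const std::string tok = it->str();
        const bool is_number =
            std::isdigit(static_cast<unsigned char>(tok[0])) != 0;

        if (pending_sign && is_number)
        {
            out.back() += tok; // "-" followed by "1" becomes "-1"
            pending_sign = false;
            continue;
        }

        // Assumption: a '-' is a sign only at the start of the input or
        // after one of the single-character tokens + - * / (
        const bool sign_position =
            out.empty() ||
            (out.back().size() == 1 &&
             std::string("+-*/(").find(out.back()[0]) != std::string::npos);

        pending_sign = (tok == "-") && sign_position;
        out.push_back(tok);
    }
    return out;
}
```

Under these assumptions, x-1 still yields three tokens (x, -, 1), while a - in front of an identifier, as in -x, is left as a separate unary operator token.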

john
  • Thank you very much for your help. One last question: what about numbers that are doubles, like 3.412? Could I handle them by adding `-?[0-9]+([.][0-9]+)?` ? – Soner from The Ottoman Empire Mar 05 '18 at 14:00
  • You'll have problems starting your regex with '-?', as I explained in the comments above about unary minus. Think about tokenising x-1.0: you probably don't want the minus to be part of your number in that case, but with that regex it will be. The simplest thing is to treat all numbers as unsigned. – john Mar 05 '18 at 18:17
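
As a rough illustration of the suggestion in the last comment, the sketch below extends the answer's regex so that the number alternative accepts an optional fraction part while staying unsigned. The function name `tokenize_decimal` is hypothetical, and the number format (digits, optionally followed by a dot and more digits) is an assumption taken from the pattern proposed in the comment with the leading `-?` dropped.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): the answer's tokenizer
// with an optional fraction part on numbers. Numbers stay unsigned, so a '-'
// is always emitted as its own token.
std::vector<std::string> tokenize_decimal(const std::string& s)
{
    static const std::regex re(
        "[[:digit:]]+([.][[:digit:]]+)?" // 3 or 3.412, no leading "-?"
        "|[[:alpha:]][[:alnum:]]*"       // identifiers such as foo
        "|[-+*/()]");                    // operators and parentheses
    std::vector<std::string> out;
    for (auto it = std::sregex_iterator(s.begin(), s.end(), re);
         it != std::sregex_iterator(); ++it)
    {
        out.push_back(it->str());
    }
    return out;
}
```

With this pattern, x-1.0 is split into x, - and 1.0, so the minus remains available as a binary operator.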