
I want to detect when the user enters "lw 2, 3(9)", but my regex can't match the parentheses. I used this pattern, but it still doesn't detect them:

    { R"((\w+) ([[:digit:]]+), ([[:digit:]]+) (\\([[:digit:]]+\\)) )"}

Can someone please help?

Frank C.
Malak Sadek
  • There's an underlying assumption here that the assembly language you're processing is a regular language. Do you have a proof for that? If not, you may be using the wrong tool for the job. – user207421 Nov 18 '16 at 08:41
  • If this is supposed to be some semi-clever parser (like IDE highlighting), you should use `\s+` and/or `\s*` wherever the user may enter whitespace, for example `lw 6 ,5 ( 2 )`. (But that's still just a hack-ish solution; a full parser would be better.) – Ped7g Nov 18 '16 at 10:31

3 Answers

You need to be careful with excessive spaces in the pattern, and since you are using a raw string literal, you should not double-escape special characters:

R"((\w+) ([[:digit:]]+), ([[:digit:]]+)(\([[:digit:]]+\)))"
                                      ^^^             ^ ^^

It might be a good idea to replace literal spaces with [[:space:]]+.

C++ demo printing lw 2, 3(9):

#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    regex rx(R"((\w+) ([[:digit:]]+), ([[:digit:]]+)(\([[:digit:]]+\)))");
    string s("Text lw 2, 3(9) here");
    smatch m;
    if (regex_search(s, m, rx)) {
        std::cout << m[0] << std::endl;
    }
    return 0;
}
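The suggestion above to replace the literal spaces with `[[:space:]]+` could be sketched roughly as follows. The helper name `contains_load` is hypothetical and only serves the illustration; the pattern is the one from this answer with each space swapped for `[[:space:]]+`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original answer): returns true when
// `line` contains something shaped like "op N, N(N)".
bool contains_load(const std::string& line) {
    // Same pattern as in the demo, but every literal space is replaced by
    // [[:space:]]+ so that runs of spaces or tabs are accepted as well.
    static const std::regex rx(
        R"((\w+)[[:space:]]+([[:digit:]]+),[[:space:]]+([[:digit:]]+)(\([[:digit:]]+\)))");
    return std::regex_search(line, rx);
}
```

Note that `[[:space:]]+` still requires at least one whitespace character after the comma, so `lw 2,3(9)` would be rejected by this version; use `[[:space:]]*` there if that input should pass.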
Wiktor Stribiżew
R"((\w+) (\d+), (\d+)(\(\d+\)))"

worked for me

cokceken
  • And why did you use `\d` inside the brackets? `\d` = `[[:digit:]]`. – Wiktor Stribiżew Nov 18 '16 at 08:38
  • For "any digit" as [0-9], it looked simpler – cokceken Nov 18 '16 at 08:39
  • Yes, `[[:digit:]]` = `\d`, you do not have to write `[\d]`. Actually, your answer is just the same as mine, but without any explanation. – Wiktor Stribiżew Nov 18 '16 at 08:39
  • Yes, you are right :) I will edit my answer, thank you. Edit: my answer may help him/her simplify the string, so I did not delete it after I saw your answer. – cokceken Nov 18 '16 at 08:40
  • `[0-9]` is equivalent to `[[:digit:]]`, but `\d` captures all types of numerals. http://stackoverflow.com/questions/6479423/does-d-in-regex-mean-a-digit – cokceken Nov 18 '16 at 08:50
  • @LưuVĩnhPhúc: In ECMAScript regex standard (used by default with `std::regex`), `\d` = `[0-9]`. .NET is a different regex engine where all the shorthand character classes are Unicode aware. – Wiktor Stribiżew Nov 18 '16 at 09:21
Since you didn't specify whether you want to capture something or not, I'll provide both snippets.

You don't have to double the backslashes inside a raw string literal, but you do still have to escape parentheses that should match literally; unescaped, they are treated as capture groups.

#include <iostream>
#include <string>
#include <regex>

int main()
{
    std::string str = "lw 2, 3(9)";

    {
        std::regex my_regex(R"(\w+ \d+, \d+\(\d+\))");
        if (std::regex_search(str, my_regex)) {
            std::cout << "Detected\n";
        }
    }

    {
        // With capture groups
        std::regex my_regex(R"((\w+) (\d+), (\d+)(\(\d+\)))");
        std::smatch match;
        if (std::regex_search(str, match, my_regex)) {
            std::cout << match[0] << std::endl;
        }
    }
}
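To make the point about raw string literals concrete, here is a small sketch comparing three ways of writing the non-capturing pattern. The variable names and the `matches` helper are mine, introduced only for this comparison.

```cpp
#include <regex>
#include <string>

// Raw string literal: backslashes reach the regex engine unchanged,
// so \( is an escaped (literal) parenthesis.
const std::regex raw_rx(R"(\w+ \d+, \d+\(\d+\))");

// Ordinary string literal: the compiler consumes one level of backslashes,
// so each one has to be doubled to produce the same regex as above.
const std::regex plain_rx("\\w+ \\d+, \\d+\\(\\d+\\)");

// The mistake from the question: doubling inside a raw literal. The regex
// engine now sees \\ (a literal backslash) followed by ( (a capture group).
const std::regex doubled_rx(R"(\w+ \d+, \d+\\(\d+\\))");

// Hypothetical helper: true when the whole string matches the regex.
bool matches(const std::string& s, const std::regex& rx) {
    return std::regex_match(s, rx);
}
```

With these definitions, `raw_rx` and `plain_rx` both accept `lw 2, 3(9)`, while `doubled_rx` rejects it and instead accepts input with literal backslashes around the last number.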

An additional improvement could be to handle multiple spacing (if that is allowed in your particular case) with \s+.
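A minimal sketch of that whitespace-tolerant variant, assuming optional whitespace is allowed around the comma and the parentheses (as in `lw 6 ,5 ( 2 )` from the comments). The `MemOperand` struct, its field names, and `parse_line` are hypothetical names chosen for the example, and the capture groups here hold only the digits rather than the parenthesized text.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch; the field names assume the
// usual "op reg, offset(base)" reading of the instruction.
struct MemOperand {
    std::string op;
    int reg;
    int offset;
    int base;
};

// Returns true and fills `out` when the whole line has the expected shape.
bool parse_line(const std::string& line, MemOperand& out) {
    // \s+ is required between the mnemonic and the first number;
    // \s* makes whitespace optional everywhere else.
    static const std::regex rx(
        R"(\s*(\w+)\s+(\d+)\s*,\s*(\d+)\s*\(\s*(\d+)\s*\)\s*)");
    std::smatch m;
    if (!std::regex_match(line, m, rx)) {
        return false;
    }
    // std::stoi throws std::out_of_range for numbers that do not fit in int.
    out = MemOperand{m[1].str(), std::stoi(m[2].str()),
                     std::stoi(m[3].str()), std::stoi(m[4].str())};
    return true;
}
```

This uses `std::regex_match` rather than `std::regex_search`, so trailing garbage on the line is rejected instead of silently ignored.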

I can't help but notice that EJP's concerns might also be spot-on: this is a very fragile solution parsing-wise.

Marco A.