I want to find the following pattern inside a C++ string:

FIXED_WORD ANY_WORD(...)

where FIXED_WORD is a fixed keyword and ANY_WORD can be any word, as long as it is immediately followed by a bracketed part.

I have tried a regex such as KEYWORD \b(.*)\b\((.*)\), where the word boundaries in \b(.*)\b are meant to extract ANY_WORD, followed by the bracketed part:

std::string s = "abcdefg KEYWORD hello(123456)";
std::smatch match;
std::regex pattern("KEYWORD \b(.*)\b\((.*)\)");

if (std::regex_search(s, match, pattern))
{
    std::cout << "Match\n";

    for (auto m : match)
      std::cout  << m << '\n';
}
else {
    std::cout << "No match\n";
}

I always get "No match" for this.

Toto
calveeen
  • regex should match 'hello' in your given example? or it should match that there is a word between 'KEYWORD' & brackets '()'? – moys Feb 21 '20 at 02:06
  • try `(?<=KEYWORD\s)(\w*)(?=\(.*\))` as the regex pattern. This will match 'hello' in the example you have provided. Demo--> https://regex101.com/r/wZ6PVa/1 – moys Feb 21 '20 at 02:13
  • i want to be able to extract out the entire regex pattern, so not just hello – calveeen Feb 21 '20 at 02:57
  • then you can use `(KEYWORD \w*\(.*\))` as regex – moys Feb 21 '20 at 02:59
  • if KEYWORD appears another time in the string elsewhere then this would not work ? Eg "abc KEYWORD xyz KEYWORD hello(4567)" – calveeen Feb 21 '20 at 03:03
  • it will work. it is looking for KEYWORD following by another alphanumeric word followed by content in brackets. If KEYWORD is somewhere else, it will not. use the demo link to try out your different scenarios. – moys Feb 21 '20 at 03:06

1 Answer

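You're forgetting that backslashes are escape characters in an ordinary string literal: "\b" becomes a backspace character before the regex engine ever sees it, so the pattern is no longer the one you wrote. Use a raw string literal, e.g. R"(...)", to preserve the backslashes (or double each one, as in "\\b"):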

std::regex pattern(R"(KEYWORD \b(.*)\b\((.*)\))");

Then your pattern works as expected:

Match
KEYWORD hello(123456)
hello
123456

https://godbolt.org/z/dJaAAX

parktomatomi
  • hey thanks for the answer @parktomatomi. but if i am given a string like "abc... KEYWORD... KEYWORD KEYWORD(..)" it is not able to extract it out? In the sense that it fails if KEYWORD appears twice in the string, and also if ANY_WORD is itself KEYWORD – calveeen Feb 21 '20 at 02:59
  • I tried your test sequence and it found the bit at the end which matches your pattern: https://godbolt.org/z/Aj7dS6 . Were you expecting a different output? – parktomatomi Feb 21 '20 at 03:07
  • hmm it seems weird, it should work. however when i tried it on std::string s = "abc KEYWORD xyz KEYWORD KEYWORD(v, _)"; the first match was "KEYWORD xyz KEYWORD KEYWORD(v, _)". I used the "..." to represent any other string before the pattern appeared, sorry. – calveeen Feb 21 '20 at 03:12
  • That probably has something to do with `.*` being a greedy operator. I'm not smart enough to know the specifics. With patterns that have a sequence, I usually try to figure out the valid characters for that sequence, and detect a simple run of them, e.g. `\w+`. https://godbolt.org/z/6KW-o6 – parktomatomi Feb 21 '20 at 03:20
  • hey thanks @parktomatomi i managed to solve it. Sorry i have another question which is how do i extract out the ANY_WORD(...) without the keyword after finding the match ? :/ – calveeen Feb 21 '20 at 03:24
  • `match[N]` returns the Nth captured group (and `match[0]` is always the entire match). In your pattern, `match[1]` is ANY_WORD and `match[2]` is the stuff between the parentheses. – parktomatomi Feb 21 '20 at 03:28
  • hey @parktomatomi sorry yet again.. but let's say that i want to match the last occurrence of the closing bracket. I.e. if i have "abc pattern xyz pattern pattern (v, ((abcde)))", i want match[2] to be "v, ((abcde))". Currently only the first closing bracket is being matched? – calveeen Feb 21 '20 at 03:55
  • In that case _use_ the greedy operator `.+`. But honestly, if you're at the point where you're trying to balance parentheses, you should give up on regular expressions and write a parser. pegtl is a good starter library for that. – parktomatomi Feb 21 '20 at 14:48