0

Say that I have a string which contains both multiple sets and nesting of parenthesis. I want to extract only the string in the first parenthesis encountered, including whatever nested parenthesis it contains.

For example:

this (is(maybe)) a test (and maybe not)

I want to extract:

is(maybe)

I believe this can be accomplished without the use of regexes, by which I can easily do it.

So my question is how can this be accomplished without regexes?

Jonathan Mee
  • 37,899
  • 23
  • 129
  • 288
  • 1
    Note that doing this *with* regexes (at least by the classic CS definition of "regular expression") is not feasible, as regular expressions (and the automatons they are implemented with) are memory-less and thus can't do things like balancing brackets... With some of the things generically called "regexes" today, it might be possible, but that's because they are somewhat more than the name they claim would indicate... – twalberg Mar 04 '15 at 19:48
  • @twalberg Well look at that O.O C++ regexes don't support look behind. Well I'll just throw my Perl solution right out the window. – Jonathan Mee Mar 04 '15 at 20:00
  • @JonathanMee well, in C++ it depends what library you use. Some even support several syntaxes. In general there are 5 or 6 flavors of regex. Ironically one you use in Perl or in Python are written either in C or C++ :P – Swift - Friday Pie Jun 10 '23 at 08:07

2 Answers2

5

Pseudo code:

 iterate over chars
 if char is (
   increment counter
   store position of first ( only
 if char is )
   decrement counter
   if counter == 0
      use index of current char and position of first ( to get substring
Sebastiaan M
  • 5,775
  • 2
  • 27
  • 28
4

Lest pseudo code be the only answer I've taken it upon myself to answer this using standard algorithms. Given const string foo{ "this (is(maybe)) a test (and maybe not)" } can be used to solve like this:

const auto start = find(cbegin(foo), cend(foo), '(');
const auto finish = find_if(start, cend(foo), [count = 0](const char i) mutable {
    if (i == '('){
        count++;
    }
    else if (i == ')'){
        count--;
    }
    return count <= 0; });

From here, if both start and finish are not cend(foo) the string is valid and can be obtained from string(next(start), finish) (Live Example).

It's possible that this is as good a solution as there is in C++. I guess it is just wishful thinking that there's something out there to match parentheses and find the value.

Jonathan Mee
  • 37,899
  • 23
  • 129
  • 288