
When I use a regular expression like

std::regex midiNoteNameRegex("([cdefgab])([b#]{0,1})([0-9]))|([0-9]{3})|([A-Z0-9]{2})");

the pattern contains three top-level subexpressions joined by "|", of which exactly one will match. Is there a way to tell which one, other than testing them sequentially one after the other?

If I could use named subexpressions it would be easy, but C++ has no named subexpressions.

How do I solve this problem?

DrSvanHay
  • You are using numbered capturing groups. So that seems to be trivial. – revo May 10 '19 at 19:22
  • Did you know about this error `([cdefgab])([b#]{0,1})([0-9])) <-- Unbalanced ')' |([0-9]{3})|([A-Z0-9]{2})` ? – May 10 '19 at 19:25
  • @sln thanks! That is the solution! Because of the syntax error, the numbering of the subexpressions was broken, which I misinterpreted in a way that only the matched subexpressions get an index. – DrSvanHay May 10 '19 at 21:16

2 Answers


Given the groups in your regex, finding the alternative that matched is just a flat check of the match object.
Each numbered group has a `matched` flag (a bool in C++), and testing it has no noticeable overhead.

    ( [cdefgab] )                 # (1)
    ( [b#]{0,1} )                 # (2)
    ( [0-9] )                     # (3)
 |  ( [0-9]{3} )                  # (4)
 |  ( [A-Z0-9]{2} )               # (5)
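Groups are numbered by their opening parenthesis, left to right, whether or not they take part in a match; a group in an alternative that was not taken simply has `matched == false`. A minimal sketch to check this, assuming the corrected pattern with narrow strings (`kNoteRegex` and `groupFlags` are hypothetical names used only for illustration):

```cpp
#include <cstddef>
#include <regex>
#include <string>

// The pattern from the question with balanced parentheses.
const std::regex kNoteRegex("([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");

// Hypothetical helper: one character per capture group 1..5,
// '1' if the group took part in the match and '0' otherwise.
// Returns an empty string when nothing in `text` matches.
std::string groupFlags(const std::string& text) {
    std::smatch m;
    if (!std::regex_search(text, m, kNoteRegex)) return "";
    std::string flags;
    for (std::size_t i = 1; i < m.size(); ++i)   // m[0] is the whole match
        flags += m[i].matched ? '1' : '0';
    return flags;
}
```

For example, `groupFlags("123")` should give `"00010"`: all five groups keep their numbers, and only group 4 is flagged.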

And a possible usage

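// Needs <regex> and <string>; `str` is the std::wstring being searched.
// A wregex takes a wide-character pattern, hence the L prefix.
std::wregex MyRx( L"([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})" );

std::wstring::const_iterator start = str.begin();
std::wstring::const_iterator end   = str.end();
std::wsmatch m;

while ( std::regex_search( start, end, m, MyRx ) )
{
    if ( m[1].matched ) {
        // First alternation
    }
    else if ( m[4].matched ) {
        // Second alternation
    }
    else if ( m[5].matched ) {
        // Third alternation
    }
    start = m[0].second;   // continue searching after this match
}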
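
The same check can be wrapped in a small function that reports which alternative matched. This is only a sketch under some assumptions: it uses narrow strings and `std::regex_match` (the whole input must match) instead of the search loop, and the names `NoteForm` and `whichAlternative` are made up for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical labels for the three top-level alternatives.
enum class NoteForm { None, NoteName, ThreeDigits, TwoChars };

// Returns which alternative of the pattern matched the whole of `text`.
NoteForm whichAlternative(const std::string& text) {
    static const std::regex rx(
        "([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");
    std::smatch m;
    if (!std::regex_match(text, m, rx)) return NoteForm::None;
    if (m[1].matched) return NoteForm::NoteName;     // groups 1-3
    if (m[4].matched) return NoteForm::ThreeDigits;  // group 4
    return NoteForm::TwoChars;                       // group 5 is the only one left
}
```

It is enough to test the first group of each alternative, since group 1 always participates when the first alternative is taken.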

I don't have a definite answer but I believe the answer is most likely no.

Named capturing groups are not a required feature of the ECMAScript grammar that std::regex uses by default: http://www.cplusplus.com/reference/regex/ECMAScript/

Implementing named capturing groups is probably not trivial and would probably reduce the performance of the regex engine.

I found another post on this issue that agrees with me: C++ regex: Which group matched?
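
Without named groups, one possible workaround is to give the group numbers names yourself. The following is a rough sketch, not a library feature: the enum `NoteGroup`, its enumerator names and the helper `captured` are all hypothetical, and the pattern is the one from the question with the parentheses balanced.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical names for the group numbers of the pattern below.
enum NoteGroup : std::size_t {
    Letter = 1, Accidental = 2, Octave = 3, Number = 4, Code = 5
};

// Returns the text captured by group `g` when the whole of `text` matches,
// or an empty string if there is no match or the group did not participate.
std::string captured(const std::string& text, NoteGroup g) {
    static const std::regex rx(
        "([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");
    std::smatch m;
    if (!std::regex_match(text, m, rx)) return "";
    return m[g].str();   // str() is empty for a group that did not match
}
```

With this, `captured("c#4", Accidental)` reads almost like a named-group lookup. The names still have to be kept in sync with the pattern by hand.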

TimWeri