
I don't understand how to retrieve all groups using a regular expression in C++. An example:

const std::string s = "1,2,3,5";
std::regex lrx("^(\\d+)(,(\\d+))*$");
std::smatch match;
if (std::regex_search(s, match, lrx))
{
    int i = 0;
    for (auto m : match)
        std::cout << "  submatch " << i++ <<  ": "<< m << std::endl;
}

This gives me the result:

  submatch 0: 1,2,3,5
  submatch 1: 1
  submatch 2: ,5
  submatch 3: 5

I am missing 2 and 3.

Troels Blum
  • What is your goal? To validate a string + extract digit chunks? – Wiktor Stribiżew Sep 18 '17 at 19:50
  • Regex, by default, will only capture the **last** match that is made for each matching group. If you want to capture each digit/number, you need to recursively call your regex function and strip the last match from the string so that, for example, `1,2,3,5` matches `5` and your string is now `1,2,3`. Then `1,2,3` matches `3` and your string is now `1,2`. Then `1,2` matches `2` and your string is now `1`. Then `1` matches `1` and your string is empty - end of recursion. Then again, if that's what you're doing to get your answer, regex is likely not the best tool for the job! – ctwheels Sep 18 '17 at 19:55
  • To continue off my last comment, the better solution would be to use something like [Parsing a comma-delimited std::string](https://stackoverflow.com/questions/1894886/parsing-a-comma-delimited-stdstring) – ctwheels Sep 18 '17 at 19:57
  • Well, right now, it is easy to split this on the comma if no validation is necessary. Besides the details shared by ctwheels, mind that `std::regex_search` only returns one match. You need multiple matches. And since you defined 3 capturing groups in the pattern, you have 3+1 groups in the output. – Wiktor Stribiżew Sep 18 '17 at 19:58
  • See https://ideone.com/dlIIFN, enough to match all numbers in your sample string. – Wiktor Stribiżew Sep 18 '17 at 22:00
  • Yes, the goal is validation and extraction. @WiktorStribiżew While your example works for valid input, it also accepts invalid input. So the short question is: do validation and extraction need to be two different steps when using regex in C++11? – Troels Blum Sep 19 '17 at 10:14
  • Ok, then you have to do it in 2 steps. – Wiktor Stribiżew Sep 19 '17 at 10:16

1 Answer

You cannot use the current approach, since std::regex does not keep a history of the values captured by a repeated group. Each time the group captures another part of the string, its former value is overwritten, and only the last captured value is available once the match is returned. That is why group 2 ends up holding ",5" and group 3 holds "5". Since you defined 3 capturing groups in the pattern, you get 3+1 groups in the output (group 0 is the whole match). Mind also that std::regex_search only returns one match, while you need multiple matches here.

So, what you can do is perform 2 steps: 1) validate the string using the pattern you have (no capturing is necessary here), 2) extract the digits (or split on the comma, depending on the requirements).

A C++ demo:

#include <iostream>
#include <regex>
#include <string>

int main() {
    // Step 2 pattern: every run of digits is a separate match.
    const std::regex rx_extract("[0-9]+");
    // Step 1 pattern: the whole string must be a comma-separated list of numbers.
    const std::regex rx_validate(R"(^\d+(?:,\d+)*$)");
    const std::string s = "1,2,3,5";
    if (std::regex_match(s, rx_validate)) {
        // Iterate over all matches instead of stopping at the first one.
        for (std::sregex_iterator i(s.begin(), s.end(), rx_extract), end; i != end; ++i) {
            std::cout << i->str() << '\n';
        }
    }
    return 0;
}

Output:

1
2
3
5
Wiktor Stribiżew