
Consider the code:

    regex boundary{ "\\s*\\b\\s*" };
    string test = "foo bar\t baz-floop";
    auto begin = sregex_token_iterator(test.begin(), test.end(), boundary, -1);

    for (auto i = begin; i != sregex_token_iterator{}; i++) {
        cout << *i << endl;
    }

The code was adapted from another answer and is meant to split the string by a regex. The result of running it (on VC++ 16.2.3) is:

oo

ar

az

loop

How can I correct the code so that the first letter of each token is not deleted? I can't change the regex itself. Moreover, the analogous Java code seems to work according to my expectations:

    Pattern boundary = Pattern.compile("\\s*\\b\\s*");
    String test = "foo bar\t baz-floop";
    String[] results = boundary.split(test);
    for (String result : results) {
        System.out.println(result);
    }
lukeg
  • Buggy standard library? [Works here](https://ideone.com/TpDeiq). – n. m. could be an AI Sep 12 '19 at 14:47
  • "Otherwise (if the member regex_iterator is an end-of-sequence iterator), but the value -1 is one of the values in submatches/submatch, turns *this into a suffix iterator pointing at the range [a,b) (the entire string is the non-matched suffix)" from https://en.cppreference.com/w/cpp/regex/regex_token_iterator/regex_token_iterator. It seems the -1 might be the culprit – LogicalKip Sep 13 '19 at 09:04
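
For context on the `-1` mentioned in the comment above: in the standard interface, submatch index `-1` selects the text between matches (including the trailing non-matched suffix), while `0` selects the matches themselves. A minimal sketch of the difference, using a simple comma separator rather than the regex from the question; the helper name `tokens` is made up for this illustration and is not part of the library:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration only: collects the tokens that
// std::sregex_token_iterator produces for the given submatch index.
//   submatch == -1 : the pieces of text between matches (a "split")
//   submatch ==  0 : the matches themselves
std::vector<std::string> tokens(const std::string& text,
                                const std::regex& re, int submatch) {
    std::sregex_token_iterator first(text.begin(), text.end(), re, submatch);
    std::sregex_token_iterator last;  // default-constructed: end of sequence
    return std::vector<std::string>(first, last);
}
```

With the regex `","` and the input `"a,b,c"`, index `-1` should give `a`, `b`, `c` (the final `c` being the non-matched suffix the quoted passage describes), and index `0` should give the two commas. So `-1` by itself is the usual way to split, not an error in the code.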

1 Answer


It was a bug in the standard library. Fixed here.
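
To check whether a given toolchain behaves correctly, a self-contained version of the split can help. The sketch below keeps the regex from the question unchanged; the helper name `split_nonempty` and the choice to drop empty tokens are assumptions added here, not part of the original code. Empty tokens can appear because `\b` lets the separator match an empty string. Note that this does not work around the bug: on an affected library it would presumably still lose characters.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits `text` on `re`
// using submatch index -1 and skips empty tokens, which occur when the
// separator regex matches an empty string (as \b allows).
std::vector<std::string> split_nonempty(const std::string& text,
                                        const std::regex& re) {
    std::vector<std::string> parts;
    std::sregex_token_iterator it(text.begin(), text.end(), re, -1);
    std::sregex_token_iterator last;  // end-of-sequence iterator
    for (; it != last; ++it) {
        if (it->length() > 0) {
            parts.push_back(it->str());
        }
    }
    return parts;
}
```

Called with `std::regex boundary{ "\\s*\\b\\s*" }` and the test string `"foo bar\t baz-floop"`, a conforming implementation should return `foo`, `bar`, `baz`, `-`, `floop`, with no leading letters missing.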

lukeg