This is not a duplicate of this or this question, since I am using the newest g++ 6.1.

Here is a simple example I am trying:

#include <iostream>
#include <regex>
#include <string>

int main() {
   std::string data = "a,b,c,d,e,f,g";
   std::smatch m;
   std::regex_search(data, m, std::regex("(\\w)"));
   std::cout << m.size() << std::endl;
   for (auto i = 0U; i != m.size(); i++)
       std::cout << m.position(i) << " " << m[i].str() << std::endl;
   return 0;
}

This example outputs 2 as the number of matches, while I would expect 7, since each letter in data should match \w. How do I fix this?

Also, both matches point to a at the beginning of the string.

AlwaysLearning
  • how about here: http://stackoverflow.com/questions/21667295/how-to-match-multiple-results-using-stdregex – kmdreko May 16 '16 at 15:45
  • It does not provide a solution for when the number of matches is not known in advance and I want to know the positions of the matches. – AlwaysLearning May 16 '16 at 15:49
  • If you only look at the first half of the first answer then yes, but other answers there are more generic. Particularly St0fF's answer – kmdreko May 16 '16 at 15:52
  • You are right. The last reply there does it nicely: http://stackoverflow.com/a/35026140/2725810 – AlwaysLearning May 16 '16 at 16:06

2 Answers

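regex_search does not scan the whole string; it stops at the first match. The 2 reported by m.size() is the number of sub-matches in that single match: the whole match plus the one capture group, which is why both entries point to a. Luckily, the <regex> library provides std::regex_iterator, which does the job: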

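#include <iostream>
#include <regex>
#include <string>

int main() {
   std::string data = "a,b,c,d,e,f,g";
   // Named regex object: it must stay alive while the iterators are in use.
   std::regex exp("(\\w)");

   auto mbegin = std::sregex_iterator(data.begin(), data.end(), exp);
   auto mend = std::sregex_iterator();  // default-constructed = end of sequence

   for (auto it = mbegin; it != mend; ++it)
     std::cout << it->str() << std::endl;

   return 0;
}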

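The only caveat is that the std::regex object must outlive the iterator, since std::regex_iterator stores a pointer to it internally. In particular, do not construct the iterator from a temporary std::regex (since C++14 that constructor overload is deleted, so it should not compile at all).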

Jack

Here is an excerpt from Finding All Regex Matches at regular-expressions.info:

Construct one object by calling the constructor with three parameters: a string iterator indicating the starting position of the search, a string iterator indicating the ending position of the search, and the regex object. If there are any matches to be found, the object will hold the first match when constructed. Construct another iterator object using the default constructor to get an end-of-sequence iterator. You can compare the first object to the second to determine whether there are any further matches. As long as the first object is not equal to the second, you can dereference the first object to get a match_results object.

So, you can use the following to get matches and their positions:

#include <iostream>
#include <string>
#include <regex>

int main() {
    std::regex r(R"(\w)");
    std::string s("a,b,c,d,e,f,g");
    for(std::sregex_iterator i = std::sregex_iterator(s.begin(), s.end(), r);
                             i != std::sregex_iterator();
                             ++i)
    {
        std::smatch m = *i;
        std::cout << "Match value: " << m.str() << " at Position " << m.position() << '\n';
    }
    return 0;
}

See the IDEONE demo

Results:

Match value: a at Position 0
Match value: b at Position 2
Match value: c at Position 4
Match value: d at Position 6
Match value: e at Position 8
Match value: f at Position 10
Match value: g at Position 12

The regex is better declared with a raw string literal: R"(\w)" yields the pattern \w directly, without doubling the backslash as in "\\w".

Wiktor Stribiżew