
I've been sitting here for nearly a day now and cannot figure out why the C++11 regex library gives me the output it does. The problem is not finding the pattern; I already designed and tested it in various regex testers (Regexpal, for example).

An example string I want to process would be:

if12b031, if12b141, ic12a042

These are usernames consisting of letters and numbers with a maximum length of 8 characters, each username separated by a comma. The string is entered by the user and must not end with a comma. The spaces around the commas are optional.

This pattern was my approach to solve this problem:

^[A-z0-9]{1,8}(\s*,\s*[A-z0-9]{1,8})*$

Here the user has to enter at least one username, but can enter as many as they want, as long as the usernames are separated by commas and each has a maximum length of 8 characters.

Now the strange thing is that this pattern works when I test it in the regex tester mentioned above, but it doesn't work in my code.

I've created a small example program that does nothing but test the pattern.

#include <regex>
#include <string>
#include <iostream>

using namespace std;

int main(int argc, char const *argv[])
{
    string tmp;
    string pattern = "^[A-z0-9]{1,8}(\\s*,\\s*[A-z0-9]{1,8})*$";

    while(true)
    {
        getline(cin, tmp);

        cout << "input: " << tmp << endl;
        cout << "pattern: " << pattern << endl;

        try {
            if(regex_match(tmp, regex(pattern, std::regex_constants::basic))) {
                cout << "match" << endl;
            }
            else
            {
                cout << "no match" << endl;
            }
        } catch (std::regex_error& e) {
            cout << e.code() << endl;
        }
    }
    return 0;
}

I compiled using the following command:

c++ -std=c++11 -o test test.cpp

Now the strange thing is that I cannot even get simple patterns like [A-z]{1,8} to work. It gives me a match if I enter a single character, but it also matches if I enter a number, and I don't understand why.

It always prints "no match" as soon as the input length exceeds 1. And it seems that regex_match does not care about the pattern as long as the input length is 1.

Why is that? I honestly can't see where I am making a mistake here. It even matches some special characters like $ or %, but it doesn't match §.

I've tried several regex_constants in the constructor of the regex object.

  • extended, for example, gives me error code 5 as soon as I add parentheses. And even without them, it doesn't match any input with more than 1 character.

  • basic doesn't throw any error, but it is still the same strange behaviour.

  • ECMAScript complains with error code 4, which means brackets.

I am honestly out of ideas as to why this doesn't work.

I am running Ubuntu 13.10 64bit Gnome in a virtual machine (VMWare), but I also tried it on my laptop, where it is installed as a dual-boot system. gcc version is 4.8.1.

As this is my first question, I hope I provided enough details for you guys to help me out. Thanks in advance.

Thomas Eizinger

1 Answer

gcc's regex implementation might compile, but that's about it: it is mostly unimplemented in gcc 4.8 (see item 28).
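For comparison, here is a minimal sketch of how the check from the question could look on a standard library where `<regex>` is actually implemented (for libstdc++ that means GCC 4.9 or later). The function name `is_username_list` is made up for illustration. Two things differ from the question's code: it uses the ECMAScript grammar (the default), because in the POSIX `basic` grammar an unescaped `{1,8}` or `( )` is treated as literal text, and it spells the character class as `[A-Za-z0-9]`, because the ASCII range `A-z` also contains the punctuation characters between `Z` and `a`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if `input` is a comma-separated list of
// usernames, each 1 to 8 ASCII letters or digits, with optional whitespace
// around the commas. Assumes a standard library with a working <regex>.
bool is_username_list(const std::string& input)
{
    // [A-Za-z0-9] instead of [A-z0-9]: in ASCII, A-z would also accept
    // the characters [ \ ] ^ _ ` that sit between 'Z' and 'a'.
    static const std::regex pattern(
        "[A-Za-z0-9]{1,8}(\\s*,\\s*[A-Za-z0-9]{1,8})*",
        std::regex_constants::ECMAScript);

    // regex_match succeeds only if the entire input matches, so the
    // ^ and $ anchors from the original pattern are not required.
    return std::regex_match(input, pattern);
}
```

With a working implementation, the example string from the question should be accepted, while an input with a trailing comma or a username longer than 8 characters should be rejected.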

KillianDS
  • Yeah, even though GCC 4.8 is fine on the core language side, it still lacks a lot on the standard library side. Unfortunately, regexes are but one known issue. :( – syam Oct 19 '13 at 23:46
  • Okay, that explains a lot. Thanks for the information guys, I will need to check the syntax of the input "by hand", which is what I wanted to avoid in the first place. – Thomas Eizinger Oct 19 '13 at 23:53
  • @ThomasEizinger If you have to use gcc, you could just use boost.regex, which is pretty much what was adopted as the C++ TR1/C++11 standard library component. – Cubbi Oct 19 '13 at 23:58
  • Poco is another library similar to boost that I find easy to use. – Jason Enochs Oct 20 '13 at 00:04
  • PCRE is another alternative regex library (with `pcre++` as a C++ wrapper). – syam Oct 20 '13 at 00:19
  • Since I don't have any other use for boost in my program and the problem is rather simple to solve without regex, I will go for the manual way, but thanks for the suggestions. – Thomas Eizinger Oct 20 '13 at 00:22
  • It kind of makes you wonder why it even compiles if it doesn't work. – Gabe Oct 20 '13 at 00:56
  • Just for completeness I'll mention that the recently released version of libstdc++ has `<regex>` implemented. – R. Martinho Fernandes Oct 20 '13 at 11:59
  • Give gcc 4.9 trunk a chance. regex went into trunk last week :-) This version is not released yet, but maybe it works as expected. – Klaus Oct 20 '13 at 15:50
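The "by hand" check mentioned in the comments could be sketched roughly as follows, without any regex library. This is only an illustration under assumptions: the helper names (`is_valid_username`, `trim`, `is_username_list_manual`) are made up, "letters and numbers" is taken to mean ASCII letters and digits, and the sketch is slightly more lenient than the original pattern in that it also tolerates whitespace at the very start and end of the input.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: one username is 1 to 8 ASCII letters or digits.
bool is_valid_username(const std::string& name)
{
    if (name.empty() || name.size() > 8) return false;
    for (char c : name) {
        const bool ok = (c >= '0' && c <= '9') ||
                        (c >= 'A' && c <= 'Z') ||
                        (c >= 'a' && c <= 'z');
        if (!ok) return false;
    }
    return true;
}

// Hypothetical helper: strips leading and trailing whitespace.
std::string trim(const std::string& s)
{
    std::size_t begin = 0;
    std::size_t end = s.size();
    while (begin < end && std::isspace(static_cast<unsigned char>(s[begin]))) ++begin;
    while (end > begin && std::isspace(static_cast<unsigned char>(s[end - 1]))) --end;
    return s.substr(begin, end - begin);
}

// Splits the input at commas and validates every field. An empty field
// (empty input, trailing comma, or two commas in a row) is rejected.
bool is_username_list_manual(const std::string& input)
{
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = input.find(',', start);
        // Take the text up to the next comma, or up to the end of the input.
        const std::size_t length =
            (comma == std::string::npos) ? std::string::npos : comma - start;
        if (!is_valid_username(trim(input.substr(start, length)))) return false;
        if (comma == std::string::npos) return true;  // last field was valid
        start = comma + 1;
    }
}
```

Calling `is_username_list_manual(tmp)` in place of the `regex_match` call in the question's loop would give a match/no match decision that does not depend on the state of `<regex>` in the installed standard library.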