I have the following code:

#include <cstdio>   // printf
#include <regex>
#include <string>

int main(int argc, char const *argv[]) {
  std::string s = "_apple_";

  std::regex r1("_(\\s|\\S)+_");
  std::regex r2("_[\\s\\S]+_");
  std::regex r3("_.+_");
  std::regex r4("_[pale]+_");

  std::smatch sm;
  printf("r1:%d r2:%d r3:%d r4:%d\n", 
        std::regex_match(s, sm, r1), 
        std::regex_match(s, sm, r2), 
        std::regex_match(s, sm, r3), 
        std::regex_match(s, sm, r4));

  return 0;
}

Output: r1:1 r2:0 r3:1 r4:1

I cannot understand why r2 does not match.

My environment is:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1 Apple LLVM version 10.0.0 (clang-1000.11.45.5) Target: x86_64-apple-darwin17.7.0 Thread model: posix InstalledDir:/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin.

  • Interesting. On a GCC compiler, [your code is working as expected](https://rextester.com/GCZSD11422). – Tim Biegeleisen Nov 19 '18 at 13:40
  • but it does not work on my computer. Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1 Apple LLVM version 10.0.0 (clang-1000.11.45.5) Target: x86_64-apple-darwin17.7.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin – user1927896 Nov 19 '18 at 13:43
  • Then maybe shorthands like `\s` and `\S` cannot be used in character classes in your flavor of C++. In any case, your first regex given is a suitable workaround. But +1 to your good question. – Tim Biegeleisen Nov 19 '18 at 13:47
  • I tested the code on Mac, Ubuntu and Windows. It works as expected on Ubuntu and Windows. So I guess \s and \S cannot be used in character classes under Apple LLVM, or it may be a bug. I am not sure. – user1927896 Nov 19 '18 at 14:19
  • I don't think the issue is the OS, so much as the version of C++ you are using. – Tim Biegeleisen Nov 19 '18 at 14:20
  • Yes, I agree with you. The issue is not the OS. I guess the issue is the compiler. – user1927896 Nov 19 '18 at 14:24
  • @user1927896 Can you please provide a link to the compiler reference? – Wiktor Stribiżew Nov 19 '18 at 17:27

1 Answer

The clang regex flavor is POSIX ERE according to the clang-format regex syntax reference. In POSIX bracket expressions, the usual regex escape sequences, like \s, \d, \w, and even \], are not supported.

In that flavor, [\s\S] is the same as [\\sS]: it matches only a backslash, s, or S, so it cannot match the letters of apple.
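To make that equivalence concrete, here is a minimal sketch of a pattern whose bracket expression lists only a backslash, s and S. It assumes the default ECMAScript grammar of std::regex, where the backslash has to be written as an escaped literal; the helper names matches_literal_class and check_literal_class are hypothetical and not part of the question's code.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` is an underscore, then one or more
// characters from the set { backslash, 's', 'S' }, then an underscore.
// "\\\\" in the C++ source is the regex escape \\ (one literal backslash),
// so the regex itself reads _[\\sS]+_ under the ECMAScript grammar.
bool matches_literal_class(const std::string& text) {
    static const std::regex literal_class("_[\\\\sS]+_");
    return std::regex_match(text, literal_class);
}

// Small self-check; call it from main() to try the sketch.
void check_literal_class() {
    assert(matches_literal_class("_sS\\_"));    // only characters from the set
    assert(!matches_literal_class("_apple_"));  // 'a', 'p', 'l', 'e' are not in the set
}
```

If r2 is interpreted this way on a given toolchain, the 0 reported for "_apple_" is the expected result rather than a failed match of "any character".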

However, in the POSIX regex standard, . matches any character, including line break characters, so there is no need for the [\s\S] workaround.
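If the same source has to build on several toolchains, one option is to avoid shorthand classes inside brackets altogether and use the alternation from r1, which matched on every platform mentioned in the question. The following is a sketch under that assumption only; the helper names matches_any_run and check_any_run are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` is an underscore, then one or more
// characters of any kind, then an underscore. The alternation (\s|\S) keeps
// the shorthand classes outside of a bracket expression.
bool matches_any_run(const std::string& text) {
    static const std::regex any_run("_(\\s|\\S)+_");
    return std::regex_match(text, any_run);
}

// Small self-check; call it from main() to try the sketch.
void check_any_run() {
    assert(matches_any_run("_apple_"));
    assert(matches_any_run("_ap\nple_"));  // whitespace, including a newline, is covered by \s
    assert(!matches_any_run("__"));        // at least one character is required in between
}
```

The alternation is more verbose than a bracket expression, but it does not depend on how a particular implementation treats \s and \S inside brackets.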

Wiktor Stribiżew
  • But on an online clang compiler, [the code is working as expected](https://rextester.com/KJI53953). I am sorry, I cannot find a compiler identical to the one on my computer. On my computer, the compiler is Apple LLVM version 10.0.0 (clang-1000.11.45.5) – user1927896 Nov 20 '18 at 02:04
  • @user1927896 Yours is POSIX for sure, judging by the behavior. Check [here](https://gist.github.com/yamaya/2924292). – Wiktor Stribiżew Nov 20 '18 at 06:12