I'm using std::regex_replace in a C++ Windows project (Visual Studio 2010). The code looks like this:
std::string str("http://www.wikipedia.org/");
std::regex fromRegex("http://([^@:/]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::string fmt("https://$1wik$2.org/");
std::string result = std::regex_replace(str, fromRegex, fmt);
I would expect result to be "https://www.wikipedia.org/", but I get "https://www.wikipedia.wikipedia.org/".
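To narrow this down, it may help to look at the capture groups directly instead of going through the format string. The following is only a minimal sketch: captureGroups is a helper name made up for illustration, not part of my project. On an engine that follows the ECMAScript rules I would expect group 1 to be "www." and group 2 to be "ipedia" for the input above; the doubled "wikipedia." in my result suggests group 1 holds more than that.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): returns the text of
// every sub-match, with index 0 being the whole match. Returns an empty
// vector if the pattern does not match anywhere in the input.
std::vector<std::string> captureGroups(const std::string& input, const std::regex& re)
{
    std::vector<std::string> groups;
    std::smatch m;
    if (std::regex_search(input, m, re)) {
        for (std::size_t i = 0; i < m.size(); ++i) {
            groups.push_back(m[i].str());  // unmatched groups yield ""
        }
    }
    return groups;
}
```

Calling captureGroups(str, fromRegex) with the str and fromRegex from the snippet above and printing each element shows what $1 and $2 will expand to in the replacement.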
A quick check with sed gives me the expected result:
$ cat > test.txt
http://www.wikipedia.org/
$ sed -E 's/http:\/\/([^@:\/]+\.)?wik(ipedia|imedia)\.org\//https:\/\/\1wik\2.org\//' test.txt
https://www.wikipedia.org/
I don't get where the difference comes from. I checked the flags that can be used with std::regex_replace, and I didn't see one that would help in this case.
Update
These variants work fine:
std::regex fromRegex("http://([^@:/]+\\.)wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::regex fromRegex("http://((?:[^@:/]+\\.)?)wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::regex fromRegex("http://([a-z]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::regex fromRegex("http://([^a]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
but not these:
std::regex fromRegex("http://([^1-9]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::regex fromRegex("http://([^@]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
std::regex fromRegex("http://([^:]+\\.)?wik(ipedia|imedia)\\.org/", std::regex_constants::icase);
It makes no sense to me...
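For anyone who wants to try the variants above on another compiler, here is a small sketch that runs one pattern with the same flag and format string as in my code. rewriteWith is a hypothetical helper name I picked for this example; nothing else is assumed beyond the standard <regex> header.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name invented for this sketch): compiles one pattern
// variant case-insensitively and applies the same replacement format string
// used in the question.
std::string rewriteWith(const std::string& pattern, const std::string& input)
{
    const std::regex re(pattern, std::regex_constants::icase);
    const std::string fmt("https://$1wik$2.org/");
    return std::regex_replace(input, re, fmt);
}
```

Passing each pattern string from the two lists together with "http://www.wikipedia.org/" and comparing the return value against "https://www.wikipedia.org/" makes it easy to see which variants a given implementation handles.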