File paths can contain many characters that have special meaning in regular expression patterns. Escaping just the backslashes is not enough for robust checking in the general case.
Even a simple path, like C:\Program Files (x86)\Vendor\Product\app.exe
, contains several special characters. If you want to turn that into a regular expression (or part of a regular expression), you would need to escape not only the backslashes but also the parentheses and the period (dot).
Fortunately, we can solve our regular expression problem with more regular expressions:
std::string EscapeForRegularExpression(const std::string &s) {
static const std::regex metacharacters(R"([\.\^\$\-\+\(\)\[\]\{\}\|\?\*)");
return std::regex_replace(s, metacharacters, "\\$&");
}
(File paths can't contain *
or ?
, but I've included them to keep the function general.)
If you don't abide by the "no raw loops" guideline, a probably faster implementation would avoid regular expressions:
std::string EscapeForRegularExpression(const std::string &s) {
static const char metacharacters[] = R"(\.^$-+()[]{}|?*)";
std::string out;
out.reserve(s.size());
for (auto ch : s) {
if (std::strchr(metacharacters, ch))
out.push_back('\\');
out.push_back(ch);
}
return out;
}
Although the loop adds some clutter, this approach allows us to drop a level of escaping on the definition of metacharacters
, which is a readability win over the regex version.