Update 2: Actually, it's the regex(".{40000}") construction itself. That alone already takes that much time. Why?
regex_match("", regex(".{40000}"));
takes almost 8 seconds on my PC. Why? Am I doing something wrong? I'm using gcc 4.9.3 from MinGW on Windows 10 on an i7-6700.
Here's a full test program:
#include <iostream>
#include <regex>
#include <ctime>
using namespace std;
int main() {
    clock_t t = clock();
    regex_match("", regex(".{40000}"));
    cout << double(clock() - t) / CLOCKS_PER_SEC << endl;
}
How I compile and run it:
C:\Users\ ... \coding>g++ -std=c++11 test.cpp
C:\Users\ ... \coding>a.exe
7.643
Update: Looks like the time is quadratic in the given number. Doubling it roughly quadruples the time:
  10000   0.520 seconds (factor 1.000)
  20000   1.922 seconds (factor 3.696)
  40000   7.810 seconds (factor 4.063)
  80000  31.457 seconds (factor 4.028)
 160000 128.904 seconds (factor 4.098)
 320000 536.358 seconds (factor 4.161)
The code:
#include <cstdio>   // printf
#include <ctime>
#include <regex>
#include <string>   // to_string
using namespace std;
int main() {
    double prev = 0;
    // Doubles the repetition count each round; runs until interrupted.
    for (int i = 10000; ; i *= 2) {
        clock_t t0 = clock();
        regex_match("", regex(".{" + to_string(i) + "}"));
        double t = double(clock() - t0) / CLOCKS_PER_SEC;
        printf("%7d %7.3f seconds (factor %.3f)\n", i, t, prev ? t / prev : 1);
        prev = t;
    }
}
Still no idea why. It's a very simple regex matched against the empty string (though it's the same with short non-empty strings), so it should fail instantly. Is the regex engine just weird and bad?