
On clang version 5.0.1 the following code

#include <iostream>
#include <string>

using namespace std;

int main(int argc, char *argv[]) {
    string str = "abcdefghijklmnopqrstuvwxyz";

    for (int i = 0; i < 5000000; i++) {
        str.substr(4);
    }

    return 0;
}

gives me the following timings

real    0m0.104s
user    0m0.096s
sys     0m0.002s

If I change the call to `str.substr(3)`, I get the following timings

real    0m0.478s
user    0m0.468s
sys     0m0.003s

As far as I can tell, gcc does not suffer from this. Could anyone shed any light on what may be the problem here?

EDIT:

More importantly, might anyone have a suggestion about how to avoid this huge performance hit?

EDIT2:

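It appears likely that this is some kind of short string optimisation trickery, particularly since 22 is a magic number in that context: `substr(4)` returns 22 characters from the 26-character string, while `substr(3)` returns 23.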

shians
  • Might be short-string-optimization, not sure though. – tkausl Apr 12 '18 at 02:57
  • The traditional question: how do you compile your program, which compiler optimization parameters do you apply? – 273K Apr 12 '18 at 03:00
  • `clang++ -o test test.cpp`. Setting an optimisation level helps with the shorter string, but it does not help with the longer string. – shians Apr 12 '18 at 03:04
  • [Cannot reproduce on TIO](https://tio.run/##3ZC9UsMwEIR7PcVOKJJgYjJAKht30OYBGIqzrNgilmz0E/4mz27OTpihp@OaPe2svpnbknwzDJIC8hzzh@3jHAUolX0vLrSVbawUct354BSZ4pfHjrZ1IUT0rLBklO9JKvhQZUJoG2BI28W4kKvlFWRDDpe8H56el/gS4DlRRsE9ZlTKSu3qRr/sW2O7/tX5EA9v7x@fM0aO@V3nMCE159cZS47Nehp@JMkP98xOfSxZFnfLbLKPJ4pTITrL/8VR8MlCyJZsnSRYbW@w6kDnBoI2Cuk1ceCfF3T7x4KG4Rs). – user202729 Apr 12 '18 at 03:04
  • I've edited the question. The example was originally `str.substr(str.find('e'))` and `str.substr(str.find('d'))` but I changed it for clarity and made a mistake. – shians Apr 12 '18 at 03:05
  • This isn't how you microbenchmark. However, I'd guess the short string optimization let the optimizer make the correct inlining and eliminations that make it so fast. I doubt this is a big issue in practice – Passer By Apr 12 '18 at 04:18
  • It's a big issue because the substr calls are taking up 4 minutes of a 6 minute computation; if I can fix this I can save at least 50% of my running time. – shians Apr 12 '18 at 05:39
