3

In C++17, if we design a class like this:

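#include <string>
#include <utility>  // std::move

class Editor {
public:
  // "copy" constructor: copies the caller's string into _text
  Editor(const std::string& text) : _text {text} {}

  // "move" constructor: takes over the contents of an rvalue string
  Editor(std::string&& text) : _text {std::move(text)} {}

private:
  std::string _text;
};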

It might seem (to me at least) that the "move" constructor should be much faster than the "copy" constructor.

But if we try to measure actual times, we will see something different:

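  #include <chrono>
  #include <iostream>
  #include <string>

  using namespace std;  // chrono, cout, endl and the ""s string literal

  // Current time as a raw tick count of the clock.
  // Returned as long long because the count does not fit in an int.
  long long current_time()
  {
    return chrono::high_resolution_clock::now().time_since_epoch().count();
  }

  int main()
  {
    int N = 100000;

    auto t0 = current_time();
    for (int i = 0; i < N; i++) {
      std::string a("abcdefgh"s);
      Editor {a}; // copy!
    }
    auto t1 = current_time();
    for (int i = 0; i < N; i++) {
      Editor {"abcdefgh"s}; // move from a temporary
    }
    auto t2 = current_time();

    cout << "Copy: " << t1 - t0 << endl;
    cout << "Move: " << t2 - t1 << endl;
  }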

Both copy and move times are in the same range. Here's one of the outputs:

Copy: 36299550
Move: 35762602

I tried with strings as long as 285604 characters, with the same result.

Question: why is the "copy" constructor Editor(const std::string& text) : _text {text} {} so fast? Doesn't it actually create a copy of the input string?

Update: I ran the benchmark given here using the following command: g++ -std=c++1z -O2 main.cpp && ./a.out

Update 2: Fixing the move constructor as @Caleth suggests (removing const from const std::string&& text) improves things!

Editor(std::string&& text) : _text {std::move(text)} {}

Now the benchmark output looks like:

Copy: 938647
Move: 64
metal
dimakura
  • I'd expect to see both loops be translated into precisely the same assembly, because compiler should notice that string `a` is not used for anything else than initializing `Editor` and optimize it away. – Yksisarvinen May 24 '19 at 12:08
  • I would print the strings (outside of the performance measurement), to prevent the compiler from optimizing anything away – Regis Portalez May 24 '19 at 12:10
  • Such short strings may be stored in an internal buffer. Look up "short string optimization" for more info. – D Drmmr May 24 '19 at 12:15
  • @DDrmmr I doubt SSO would kick in for strings 285,604 characters long, though for the case shown in the question's code that's a good point. – Lightness Races in Orbit May 24 '19 at 12:19
  • You did not provide how the code was compiled. If it wasn't optimized, the results are irrelevant. – Eljay May 24 '19 at 12:29
  • `const std::string&&` looks like a typo. You can't move from it. – Caleth May 24 '19 at 12:35
  • OP has edited it, it wasn't originally const std::string&&, suspect he intended to add the const to the copy constructor. – Benj May 24 '19 at 12:37
  • On libc++, [move is faster.](http://quick-bench.com/UX_yMZA9kbOsX3J7j734mD1yPSU). On libstdc++, [copy is faster](http://quick-bench.com/9XFlnucs4RK6v9m2i4WwkaL8Hbc). Just kill me now, huh. – Nikos C. May 24 '19 at 12:38
  • @Caleth I think you are right. `const string&&` was fine with compiler, but once I removed `const`, it now runs much much faster! – dimakura May 24 '19 at 12:39
  • Hah oh yeah. That's the answer. Please post as such @Caleth – Lightness Races in Orbit May 24 '19 at 12:56

4 Answers

3

It also depends on your optimization flags. With no optimization, you can (and I did!) get even worse results for the move:

Copy: 4164540
Move: 6344331

Running the same code with -O2 optimization gives a much different result:

Copy: 1264581
Move: 791

See it live on Wandbox.

That's with clang 9.0. On GCC 9.1, the difference is about the same for -O2 and -O3 but not quite as stark between copy and move:

Copy: 775
Move: 508

I'm guessing that's a small string optimization kicking in.

In general, the containers in the standard library work best with optimizations on because they have a lot of little functions that the compiler can easily inline and collapse when asked to do so.

Also in that first constructor, per Herb Sutter, "Prefer passing a read-only parameter by value if you’re going to make a copy of the parameter anyway, because it enables move from rvalue arguments."
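A minimal sketch of what that advice could look like here, assuming a single by-value constructor replaces both overloads. `ValueEditor`, `text()` and `value_editor_demo` are names made up for this illustration and are not part of the question's code:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical variant of the question's Editor with one "sink" constructor.
class ValueEditor {
public:
  // The parameter is copy-constructed from lvalue arguments and
  // move-constructed from rvalue arguments; either way it is then
  // moved into the member.
  explicit ValueEditor(std::string text) : _text{std::move(text)} {}

  // Accessor added only so the sketch can be checked.
  const std::string& text() const { return _text; }

private:
  std::string _text;
};

// Usage sketch: both call styles go through the same constructor.
void value_editor_demo() {
  std::string a(64, 'x');
  ValueEditor from_lvalue{a};                         // a is copied
  assert(a == std::string(64, 'x'));                  // the caller's string is untouched
  assert(from_lvalue.text() == a);

  ValueEditor from_rvalue{std::string("abcdefgh")};   // the temporary is moved
  assert(from_rvalue.text() == "abcdefgh");
}
```

The trade-off is one extra move per call compared with the two-overload version, which is usually cheap for `std::string`.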


Update: For very long strings (300k characters), the results are similar to the above (now using std::chrono::duration in milliseconds to avoid int overflows) with GCC 9.1 and optimizations:

Copy: 22560
Move: 1371

and without optimizations:

Copy: 22259
Move: 1404
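For reference, a sketch of how such a millisecond measurement might be written. The helper name `elapsed_ms` is hypothetical, and using `std::chrono::steady_clock` is an assumption on my part; it is not necessarily the exact code behind the numbers above:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed wall time
// in whole milliseconds as a 64-bit value, so it cannot overflow an int.
template <typename Work>
long long elapsed_ms(Work&& work) {
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
      .count();
}

// Usage sketch: time a small busy loop.
void elapsed_ms_demo() {
  const long long ms = elapsed_ms([] {
    volatile int x = 0;  // volatile so the loop is not removed entirely
    for (int i = 0; i < 1000; ++i) x = x + i;
  });
  assert(ms >= 0);
}
```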
metal
  • I used `const string&` in the actual example and missed it when copying the example. I updated my question to better reflect that. Without `const string&`, optimizations do give improved performance, but when `const string&` is used, there seems to be no difference. – dimakura May 24 '19 at 12:34
  • Regarding SSO, note that the OP also discusses strings of size up to 285,604. – Lightness Races in Orbit May 24 '19 at 12:44
  • It will also depend on the compiler version and stdlib version. Which are you using? Switching Wandbox to GCC 5 or earlier shows much less dramatic improvement, which could mean it all comes down to the small string optimizations introduced in the GCC standard library (at least in the code as posted, not the super long strings). – metal May 24 '19 at 12:45
  • @LightnessRacesinOrbit: Correct, but the code as written could live in SSO land. – metal May 24 '19 at 12:46
2

const std::string&& looks like a typo.

You can't move from it, so you get a copy instead.
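A small sketch of why that happens: `std::move` on a `const std::string` yields a `const std::string&&`, which cannot bind to the move constructor's `std::string&&` parameter, so overload resolution falls back to the copy constructor. The struct and function names below are hypothetical. The buffer check in the second function assumes a string longer than the small-string buffer and relies on what common standard library implementations do:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical editor with the typo: a const rvalue reference parameter.
struct ConstRvalueEditor {
  explicit ConstRvalueEditor(const std::string&& text)
      : _text{std::move(text)} {}  // selects the copy constructor
  std::string _text;
};

// Hypothetical editor with the fix: a plain rvalue reference parameter.
struct RvalueEditor {
  explicit RvalueEditor(std::string&& text)
      : _text{std::move(text)} {}  // selects the move constructor
  std::string _text;
};

// True if the const&& version left the source intact and made its own copy.
bool const_rvalue_leaves_source_intact() {
  std::string source(64, 'x');
  const std::string expected = source;
  ConstRvalueEditor editor{std::move(source)};
  return source == expected && editor._text.data() != source.data();
}

// True if the && version took over the source's heap buffer
// (assumption: typical implementations, string beyond SSO capacity).
bool rvalue_takes_buffer() {
  std::string source(64, 'x');
  const char* buffer = source.data();
  RvalueEditor editor{std::move(source)};
  return editor._text.data() == buffer;
}

// Usage sketch.
void const_rvalue_demo() {
  assert(const_rvalue_leaves_source_intact());
  assert(rvalue_takes_buffer());
}
```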

Caleth
  • Correct. The OP edited the code and meant to add that to the first ctor, not the second. – metal May 24 '19 at 13:15
2

Your test is really measuring the number of times a string object has to be "built".

In the first test:

for (int i = 0; i < N; i++) {
  std::string a("abcdefgh"s);    // Build a string once.
  Editor {a}; // copy!           // Here you build the string again.
}                                // So basically two expensive memory
                                 // allocations and a copy of the string.

While in the second test:

for (int i = 0; i < N; i++) {
  Editor {"abcdefgh"s};         // You build a string once.
                                // Then internally you move the allocated
                                // memory (so only one expensive memory
                                // allocation and no extra string copy).
}

So the difference between the two loops is one extra string copy.
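One way to check this count rather than time it is to instrument the allocator. The following is only a sketch under assumptions: `CountingAllocator`, `CountedString`, `CountedEditor` and the helper functions are hypothetical names, and a 64-character string is used so that no small-string buffer hides the allocations (the question's 8-character string may not allocate at all):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical global counter, bumped on every allocation request.
std::size_t g_allocations = 0;

// Minimal allocator that counts calls and forwards to std::allocator.
template <typename T>
struct CountingAllocator {
  using value_type = T;
  CountingAllocator() = default;
  template <typename U>
  CountingAllocator(const CountingAllocator<U>&) noexcept {}
  T* allocate(std::size_t n) {
    ++g_allocations;
    return std::allocator<T>{}.allocate(n);
  }
  void deallocate(T* p, std::size_t n) noexcept {
    std::allocator<T>{}.deallocate(p, n);
  }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Same shape as the question's Editor, but using the counted string type.
struct CountedEditor {
  explicit CountedEditor(const CountedString& text) : _text{text} {}
  explicit CountedEditor(CountedString&& text) : _text{std::move(text)} {}
  CountedString _text;
};

// Allocations for one iteration of the "copy" loop.
std::size_t allocations_with_copy() {
  g_allocations = 0;
  CountedString a(64, 'x');
  CountedEditor editor{a};
  return g_allocations;
}

// Allocations for one iteration of the "move" loop.
std::size_t allocations_with_move() {
  g_allocations = 0;
  CountedEditor editor{CountedString(64, 'x')};
  return g_allocations;
}

// Usage sketch: the copy path needs at least one more allocation.
void counting_demo() {
  assert(allocations_with_copy() > allocations_with_move());
}
```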

Here is the problem: even I, as a human, can spot one easy peephole optimization (and the compiler is better at this than me).

for (int i = 0; i < N; i++) {
  std::string a("abcdefgh"s);   // This string is only used in a single
                                // place where it is passed to a
                                // function as a const parameter

                                // So we can optimize it out of the loop.

  Editor {a};
}

Suppose we manually hoist the string out of the loop (equivalent to a valid compiler optimization).

This loop then has the same effect:

std::string  a("abcdefgh"s); 
for (int i = 0; i < N; i++) {
  Editor {a};
}

Now this loop only has 1 allocation and copy.
So now both loops look the same in terms of the expensive operations.

As a human, I am not going to (quickly) spot every possible optimization. My point is that a quick test like this will miss many of the optimizations the compiler performs, so estimating costs and timing code this way is hard.

Martin York
0

On paper you're right, but in practice this is quite easily optimisable so you'll probably find the compiler has ruined your benchmark.

You could benchmark with "optimisations" turned off, but that in itself holds little real-world benefit. It may be possible to trick the compiler in release mode by adding some code that prevents such an optimisation, but off the top of my head I can't imagine what that would look like here.
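One commonly used trick, sketched here as an assumption rather than a guaranteed fix, is an empty inline-assembly statement that makes the compiler treat an object as observed. This relies on the GCC/Clang extended asm extension and is not portable to other compilers. The names `escape` and `construct_many` are chosen for this illustration, and `Editor` is a reduced copy of the question's class:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Tells the optimizer that the object behind `p` may be read by code it
// cannot see, so the work that produced it cannot simply be discarded.
// GCC/Clang-specific; the asm body is empty and emits no instructions.
inline void escape(const void* p) {
  __asm__ __volatile__("" : : "g"(p) : "memory");
}

// Reduced copy of the question's class.
class Editor {
public:
  Editor(const std::string& text) : _text{text} {}
  Editor(std::string&& text) : _text{std::move(text)} {}

private:
  std::string _text;
};

// Builds `n` editors from temporaries and returns how many were built.
// Without escape(), the whole loop body could be optimized away.
int construct_many(int n) {
  int constructed = 0;
  for (int i = 0; i < n; ++i) {
    Editor editor{std::string(64, 'x')};
    escape(&editor);
    ++constructed;
  }
  return constructed;
}

// Usage sketch.
void escape_demo() {
  assert(construct_many(1000) == 1000);
}
```

Benchmarking libraries provide similar helpers, which is usually a safer choice than hand-written barriers.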

It's also a relatively small string that can be copied really quickly nowadays.

I think you should just trust your instinct here (because it's correct), while remembering that in practice it might not actually make a lot of difference. But the move certainly won't be worse than the copy.

Sometimes we can and should write obviously "more efficient" code without being able to prove that it'll actually perform better on any particular day of the week with any particular phase of the moon/planetary alignment, because compilers are already trying to make your code as fast as possible.

People may tell you that this is therefore a "premature optimisation", but it really isn't: it's just sensible code.

Lightness Races in Orbit
  • I ran it with `g++ -std=c++1z -O0 main.cpp && ./a.out` with the same results – dimakura May 24 '19 at 12:13
  • @dimakura Ditto with Visual Studio 2017. The move constructor version runs slower in both debug and optimized mode. However, increasing the string size (so that SSO is not used) makes the move constructor version run more quickly in both modes – Benj May 24 '19 at 12:19
  • Get a hold of this one. On libc++, [move is faster.](http://quick-bench.com/UX_yMZA9kbOsX3J7j734mD1yPSU). On libstdc++, [copy is faster](http://quick-bench.com/9XFlnucs4RK6v9m2i4WwkaL8Hbc). – Nikos C. May 24 '19 at 12:30