11

I have to replace a few (fixed) amount of characters in a long string: I'm wondering what is the fastest, but standard compliant way.

Here is a sample code with 6 different methods; in the comment of the method I've added the time in milliseconds to execute the operation 1 million times in a test environment, with optimizations enabled.

const char* pluto = "Cia1234567Ciao!";
std::string rep = "87654321";
std::string r1 = pluto, r2 = pluto, r3 = pluto, r4 = pluto, r5 = pluto, r6 = pluto;

// (1) 300 msec
r1.replace(3, 7, rep.substr(1));  

// (2) 40 msec
std::copy(rep.begin() + 1, rep.end(), r2.begin() + 3);

// (3) 32 msec
for (int i = 1; i < 8; ++i)
    r3[2 + i] = rep[i];

// (4) 14 msec
{
    const char *c = rep.c_str() + 1;
    for (int i = 0; i < 7; ++i)
        r4[3 + i] = *c++;
}

// (5) 3 msec (BEST)
memcpy(&r5[3], &rep[1], 7);

// (6) 100 msec
r6.replace(3, 7, rep.c_str() + 1);

So the fastest way seems (5), but I fear that this method may not work correctly with the "copy-on-write" std::string optimization that many compilers use.

IMHO (5) is also the more readable.

I wonder why (4) is twice as fast as (3), I thought that operator[] of std::string was quite optimized...


UPDATE:

After reading comments I've updated my code to use the google benchmark library and the results of (3) and (4) seems to be the same, the other differences still apply:

Run on (2 X 3000 MHz CPU s)
2015-11-24 14:46:50
Benchmark                   Time(ns)    CPU(ns) Iterations
-----------------------------------------------------------
(1) bench_replace_substr        293        264     2651515
(2) bench_std_copy               39         39    19662921
(3) bench_op_bracket             15         15    39772727
(4) bench_op_bracket_2           15         15    44871795
(5) bench_memcpy                  4          4    75000000
(6) bench_replace                80         80     8333333

So the differences in (3) and (4) are gone, but the rest of the results are the same :)

gabry
  • 1,370
  • 11
  • 26
  • Notice COW is no more allowed for `std::string` since C++11. – edmz Nov 24 '15 at 12:26
  • @black SSO? Do you mean Small String Optimization? That is certainly legal in C++11. What became illegal was Copy on Write. However, the latter is still used by all libstdc++ 4.x. – Baum mit Augen Nov 24 '15 at 12:30
  • 1
    Are optimizations enabled? On my it's 15 3 0 0 0 4 in microseconds. – ForEveR Nov 24 '15 at 12:30
  • 5
    3 milliseconds for a `memcpy()` of 7 bytes? Something is wrong with your performance numbers. – Andrew Henle Nov 24 '15 at 12:31
  • Yes I did. Thanks, fixed. – edmz Nov 24 '15 at 12:32
  • 2
    Those numbers seem fishy (but good job providing them at least ;) ). E.g. I see absolutely no reason for the big discrepancy between 3) and 4). Actually, Benchmarking operations that fast can be quite difficult. For instance, are you sure the operations you want to measure did not get optimized away in some cases? [This](https://www.youtube.com/watch?v=nXaxk27zwlk) is an interesting talk on Microbenchmarks like this. – Baum mit Augen Nov 24 '15 at 12:35
  • @Andrew The benchmarks are with 1 millions of iterations. My main issue is to understand if copy-on-write could break (5). – gabry Nov 24 '15 at 13:12
  • @black I know that copy-on-write is no more legal in C++ 11, but gcc, up to 5.0 uses that by default... – gabry Nov 24 '15 at 13:16
  • @Baum optimizations are enabled, so maybe some code got removed, but in the full code (that includes loops and timing that I removed from the post for clarity) there is also a check on the final value, so at least the last iteration of every "method" does the right thing :) – gabry Nov 24 '15 at 13:16
  • 2
    @gabry That's not really indicating the compiler didn't optimize everything but the final result out. – edmz Nov 24 '15 at 13:37
  • @black using the google framework every method is a different function, that the "benchmark" main calls an arbitrary number of ways. I don't think it can be optimized out in that way, also times are too high to be "optimized" out, the fastest method requires 4ns and the slowest 300ns. – gabry Nov 24 '15 at 15:00
  • You should add `std::string::copy` to this list which makes a call to `memcpy` in my implementation, while `std::copy` makes a call to `memmove`. – Daniel Dec 11 '15 at 21:10

1 Answers1

4

The method using memcpy is standard-compliant at least since C++11 because

  1. As explained in this answer copy-on-write implementation of std::string is not allowed, because it violates standard invalidation of iterators/references requirements.

  2. std::string's characters are stored in contiguous memory, quoting 21.4.1.5:

    The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

So it is the fastest standard-compliant of the methods in your list (at least according to your benchmark results).

In fact, this should be safe even with a non-standard-compliant implementation that does copy-on-write because non-const operator[] should make a copy of the string, for example:

std::string s1("foo");
std::string s2 = s1;
std::cout << static_cast<const void*>(s1.data()) << " "
          << static_cast<const void*>(s2.data()) << "\n";
s2[0];
std::cout << static_cast<const void*>(s1.data()) << " "
          << static_cast<const void*>(s2.data()) << "\n";

prints

0x1782028 0x1782028
0x1782028 0x1782058

when I compile it with gcc 4.8.4 and a fairly old version of libstdc++ and run. Note that the pointers are different after the call to non-const operator[] meaning that the data has been copied.

Knowing that non-const operator[] does some checks in COW implementation, it might be possible to speed up even more by calling const operator[]:

const std::string &crep = rep;
memcpy(&r5[3], &crep[1], 7);

which is indeed faster on my system:

Benchmark              Time(ns)    CPU(ns) Iterations
-----------------------------------------------------
bench_memcpy_const            2          2  314215561 
bench_memcpy                  3          3  276899830 
Community
  • 1
  • 1
vitaut
  • 49,672
  • 25
  • 199
  • 336