I want to know which of these two blocks runs faster:

std::string tempMsg( 13000, '\0' ); // constructs the string with a 13000 byte buffer from the start
tempMsg.clear( ); // clears those '\0' chars to free space for the actual data

or

std::string tempMsg; // default constructs the string with a 16-byte buffer on the stack??
tempMsg.reserve( 13000 ); // reallocates the string to increase the buffer size to 13000 bytes??

So I have the following program:

#include <iostream>
#include <string>
#include <chrono>


int main()
{
    using std::chrono::high_resolution_clock;
    using std::chrono::duration;

    auto t1 = high_resolution_clock::now();

    // std::string tempMsg( 13000, '\0' ); // constructs the string with a 13000 byte buffer from the start
    // tempMsg.clear( );

    std::string tempMsg; // default constructs the string with a 16-byte buffer on the stack??
    tempMsg.reserve( 13000 );

    auto t2 = high_resolution_clock::now();

    /* Getting number of milliseconds as a double. */
    duration<double, std::milli> ms_double = t2 - t1;

    std::cout << ms_double.count() << "ms";

    return 0;
}

Will this kind of benchmarking give me meaningful results? Or is it flawed and likely to mislead me with unrealistic numbers?

Also, when comparing the results, I saw ~50% less time consumed by the second block (the one that calls std::string::reserve). Does this mean the second solution runs ~50% faster?

digito_evo
  • Your code is very far away from a benchmark, since it only runs a single iteration (and might even be optimized out completely by aggressive enough compilers). That being said you have different behaviors - the constructor call has to fill the allocated memory, the `reserve` call does not have to do anything with it – UnholySheep Nov 08 '21 at 23:55
  • @UnholySheep True. But I just need to construct a single string with a certain capacity to append some other strings to it later. I want to know if this code gives real results or not. – digito_evo Nov 08 '21 at 23:59
  • Your benchmarking code is too flawed to provide accurate results, but thinking about it logically, filling and then clearing a buffer (your use of constructor + `clear`) will be slower than not doing those two steps (your use of `reserve`) – UnholySheep Nov 09 '21 at 00:02
  • Also note that `clear()` does not necessarily "free up space". In most implementations it just changes the size, but not the capacity. To change the capacity you need to call [`shrink_to_fit`](https://en.cppreference.com/w/cpp/string/basic_string/shrink_to_fit) – Human-Compiler Nov 09 '21 at 00:02
  • No guarantee of usable results. A smart optimizing compiler will recognize that absolutely nothing happens with `tempMsg` and discard it. Elapsed time could be nanoseconds. Even if it's not optimized out, `reserve` only asks the runtime for storage. The runtime could merely respond with "Yeah. Sure." and then do nothing until you actually use the storage for something and **need** the storage. Then you'll see the hit. – user4581301 Nov 09 '21 at 00:05
  • @Human-Compiler `shrink_to_fit` also is not required to actually do that. – NathanOliver Nov 09 '21 at 00:05
  • @OP Micro benchmarking is hard to do right. Use a tool for it instead, such as this live version of Google Benchmark: https://quick-bench.com/ – NathanOliver Nov 09 '21 at 00:05
  • Make sure you test an optimised build rather than a debug build. – Jesper Juhl Nov 09 '21 at 00:10
  • @Jesper Juhl I compiled using -O3 -flto with GCC. I guess that's some heavy optimization. The difference in size between the two executable files was 7 bytes. – digito_evo Nov 09 '21 at 00:22
  • IMO, for benchmarking, you need to: 1) compile with optimizations on; 2) record an unpredictable side effect of the test (like a checksum); 3) display the recorded side effect after the test. That way, the compiler can't optimize the test code away. – Galik Nov 09 '21 at 01:08
  • [Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987) covers benchmarking in general, but your specific problem has other challenges, like not even using the allocation, as other commenters point out. Maybe it wasn't the best choice of duplicate, but it should point you in the direction of things you need to consider, like soft page faults on newly allocated memory if it didn't come from a free list and is written with zeros. Or deferring that until later code actually touches it, which might not happen in your microbenchmark but would happen in real use. – Peter Cordes Nov 09 '21 at 01:08
