Why is std::unique_ptr much slower than standard pointer... before optimizations?

Question

Lesson learned, always use optimizations when doing benchmarks...

I decided to look at std::unique_ptr as an alternative for my program. The reasons as to why are not important.

After enabling compiler optimizations, they seem to take equivalent amounts of time.

How I tested:

time_t current_time;
time(&current_time);
srand((int)current_time);
//int* p0 = new int[NUM_TESTS];
//int* p1 = new int[NUM_TESTS];
std::unique_ptr<int[]> u_p0{ new int[NUM_TESTS] };
std::unique_ptr<int[]> u_p1{ new int[NUM_TESTS] };
for (unsigned i = 0; i < NUM_TESTS; ++i){
    u_p0[i] = rand(); // Use p0 and p1 for the standard ptr test
    u_p1[i] = rand();
}
int result;
auto start = std::chrono::steady_clock::now();
for (unsigned index = 0; index < NUM_TESTS; ++index){
    result = u_p0[index] + u_p1[index]; // Use p0 and p1 for standard ptr test
}
auto end = std::chrono::steady_clock::now();
double duration = std::chrono::duration_cast<std::chrono::duration<double>>(end - start).count();
printf("time: %f\n", duration);

My environment:

Windows 8.1 64 bit
MSVC compiler in Visual Studio 2013. Using Release.
Intel 4790k (standard clock)
1600 MHz RAM

My results (using optimized compilation):

// NUM_TESTS = 1,000,000

/*
STD:
    0.001005
    0.001001
    0.001000
    0.001000
    0.001015
*/
/*
unique_ptr:
    0.001000
    0.001000
    0.000997
    0.001000
    0.001017
*/

Did you compile in release mode and enable optimizations? My experience is that `std::unique_ptr` often helps MSVC do better optimizations. — ElderBug, Mar 06 '15 at 10:11
@ElderBug compiling in release mode makes the output of `duration` `0` when it is visibly taking time to run the test. I even increased the number of loops to where it could definitely not be measured as zero time taken. — MichaelMitchell, Mar 06 '15 at 10:20
@MichaelMitchell Try with `volatile int result;`. The problem with optimisations enabled is that the compiler will detect when you do useless operations and remove them. — ElderBug, Mar 06 '15 at 10:23
@chmike It's not forbidden, it's **utterly pointless.** If you want to determine the fastest runner, you don't do so by finding who can read maps best. — Angew is no longer proud of SO, Mar 06 '15 at 10:26
Benchmarking != determine the fastest runner. It is to test and measure time. The question is perfectly justified and the answer enlightening. — chmike, Mar 06 '15 at 10:27
You people are insane. You all go straight for closing the topic saying that since I didn't do it to your standards it is wrong. I took the given advice, I updated my question, added what was asked for, but you people don't care. I don't understand how this sort of mentality develops in people, this website is for helping others, and by your actions I would argue your intent is not aligned with the purpose this glorious place was made for. — MichaelMitchell, Mar 06 '15 at 10:40
@chmike This is what the OP stated: `I decided to look at std::unique_ptr as an alternative for my program` So given the original results from the unoptimized code, would you say that `std::unique_ptr` is a good alternative? If the goal of the OP is to replace raw pointers in their program, wouldn't it be best to point out that benchmarking unoptimized code is (as another commenter put it) *pointless* in determining whether `std::unique_ptr` should be used? — PaulMcKenzie, Mar 06 '15 at 10:46
@chmike benchmarking, as in measuring how fast code is, is not meaningful if you don't ask the compiler to generate fast code. It's like how we tell athletes to run when a race starts. We don't benchmark Usain Bolt by how fast he runs when no one told him to run. — jalf, Mar 06 '15 at 11:39
We have a very good quality answer to this average-quality question, that will help other people understand the mistake of the OP. That's why I don't see any reason to bash this guy with downvotes, close proposals, a public crucifixion, etc... Some of you people like to be swaggers. — gd1, Mar 06 '15 at 14:01
I saw this perf drop while benchmarking CUDA kernels (and CPU post-processing using `std::unique_ptr`). They were prebuilt using nvcc (and different flags), so I didn't pay attention to my code being in debug mode. Unfairly downvoted imo, title is accurate. — r11, Sep 16 '18 at 08:53

score 14 · Answer 1 · answered Mar 06 '15 at 10:15

Because "standard compiler flags" means you're compiling without optimizations enabled.

std::unique_ptr is a thin wrapper around a raw pointer. As such, when dereferencing it, it goes through a very simple forwarding function, which the compiler is able to optimize away. But it only does that if optimizations are enabled. If they are, then it can eliminate the overhead of going through the wrapper, so performance will be the same as if you'd just used a raw pointer.

But if you don't ask the compiler to optimize your code, then every time you access the pointer, it has to go through the small wrapper function to get to the actual internal pointer.
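
To make the "small wrapper function" concrete, here is a minimal sketch of what such a wrapper might look like. `TinyArrayPtr`, `sum_wrapped` and `sum_raw` are hypothetical names made up for illustration; this is not the actual standard library implementation, only the same general shape.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::unique_ptr<int[]>. Illustration only,
// not the real library code.
class TinyArrayPtr {
public:
    explicit TinyArrayPtr(int* p) : ptr_(p) {
        assert(p != nullptr);  // this sketch assumes a non-null array
    }
    ~TinyArrayPtr() { delete[] ptr_; }

    // Non-copyable, like std::unique_ptr.
    TinyArrayPtr(const TinyArrayPtr&) = delete;
    TinyArrayPtr& operator=(const TinyArrayPtr&) = delete;

    // The forwarding function: it only indexes the stored pointer.
    // When inlined, a[i] is the same as raw[i]; when not inlined,
    // every access pays for a function call.
    int& operator[](std::size_t i) const { return ptr_[i]; }

    int* get() const { return ptr_; }

private:
    int* ptr_;
};

// Sums n elements, going through the wrapper on every access.
long long sum_wrapped(const TinyArrayPtr& a, std::size_t n) {
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}

// Sums n elements through a raw pointer.
long long sum_raw(const int* a, std::size_t n) {
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}
```

With optimizations on, a compiler will typically inline `operator[]`, so the two loops should compile to much the same code. Without them, `sum_wrapped` usually makes a real call per element.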

Always, always enable optimization when benchmarking code.
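
One practical catch, raised in the comments: once optimizations are on, the compiler may delete a loop whose result is never used, and the measured time drops to zero. Below is a rough sketch of the `volatile int result;` suggestion applied to the question's loop. `time_additions` is a hypothetical helper name, and the sketch assumes the element values are small enough that adding two of them does not overflow `int`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical helper: times n additions read through two unique_ptr arrays
// and returns the elapsed time in seconds.
double time_additions(const std::unique_ptr<int[]>& a,
                      const std::unique_ptr<int[]>& b,
                      std::size_t n) {
    assert(a && b);  // both arrays must be allocated by the caller

    // Every write to a volatile object is an observable side effect,
    // so the optimizer is not allowed to drop the loop.
    volatile int result = 0;

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        result = a[i] + b[i];
    }
    auto end = std::chrono::steady_clock::now();

    (void)result;  // silence "set but not used" warnings
    return std::chrono::duration<double>(end - start).count();
}
```

The volatile store adds a small cost of its own on every iteration, so this is better for comparing two variants under the same conditions than for reading absolute timings.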
