Make sure to run benchmark code with --release, otherwise the results will be pretty much meaningless.
In your case, if I run it with --release, I get:
Simple sum: 100ns
Many heap calls: 100ns
This means that the compiler completely optimized away everything, because your loops had zero side effects. If (apart from the time it would take) an operation has no effect, the compiler is allowed to simply remove it.
Note that the compiler even warns:
warning: variable `sum` is assigned to, but never used
 --> src\main.rs:5:13
  |
5 |     let mut sum = 0;
  |             ^^^
  |
  = note: consider using `_sum` instead
  = note: `#[warn(unused_variables)]` on by default
That said, there are situations where you want to keep operations even though they have no side effects, such as benchmarking. For that, Rust provides std::hint::black_box, a function that returns exactly what you give it but looks to the compiler as if some opaque calculation took place, so the compiler can no longer prove that the output equals the input. That keeps the compiler from optimizing the call away, and with it everything that feeds into it. Note that the standard library documents this as best-effort behavior, not a hard guarantee.
In your case, this is one example of how you could prevent Rust from optimizing away your loop:
use std::time::Instant;

fn main() {
    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..100000 {
        sum += 42;
        std::hint::black_box(sum);
    }
    println!("Simple sum: {:?}", start.elapsed());

    let start2 = Instant::now();
    for _ in 0..100000 {
        let b = std::hint::black_box(Box::new(42));
        std::hint::black_box(Box::leak(b));
    }
    println!("Many heap calls: {:?}", start2.elapsed());
}
Simple sum: 27.2µs
Many heap calls: 2.8956ms
Now those numbers make more sense.
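If you end up timing more than a couple of snippets, the pattern above can be wrapped in a small helper. This is only a sketch: time_it is a hypothetical name (not something from the standard library or from your code), and it relies on black_box keeping each result alive, which is best-effort.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` `iterations` times and returns the
/// total elapsed wall-clock time.
pub fn time_it<T>(iterations: u32, mut work: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        // Passing each result through black_box asks the optimizer
        // (best effort) to treat it as used, so the call is not removed.
        black_box(work());
    }
    start.elapsed()
}
```

For example, time_it(100_000, || Box::new(42)) would time one allocation plus one deallocation per iteration, because each box is dropped after black_box returns it. That differs from the Box::leak version above, which never frees.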
To be 100% sure that it didn't optimize away anything important, you can always check the disassembly. Because the assembly surrounding the println!() calls is hard to read, it makes sense to extract the loops into their own functions.
Be sure to make those functions pub so that they show up in the final assembly; otherwise they might disappear due to inlining.
Here is how this would look:
use std::time::Instant;

pub fn simple_sum() {
    let mut sum = 0;
    for _ in 0..100000 {
        sum += 42;
        std::hint::black_box(sum);
    }
}

pub fn many_heap_calls() {
    for _ in 0..100000 {
        let b = std::hint::black_box(Box::new(42));
        std::hint::black_box(Box::leak(b));
    }
}

fn main() {
    let start = Instant::now();
    simple_sum();
    println!("Simple sum: {:?}", start.elapsed());

    let start2 = Instant::now();
    many_heap_calls();
    println!("Many heap calls: {:?}", start2.elapsed());
}
example::simple_sum:
        sub     rsp, 4
        mov     eax, 42
        mov     rcx, rsp
.LBB0_1:
        mov     dword ptr [rsp], eax
        add     eax, 42
        cmp     eax, 4200042
        jne     .LBB0_1
        add     rsp, 4
        ret

example::many_heap_calls:
        push    r15
        push    r14
        push    rbx
        sub     rsp, 16
        mov     ebx, 100000
        mov     r14, qword ptr [rip + __rust_alloc@GOTPCREL]
        lea     r15, [rsp + 8]
.LBB1_1:
        mov     edi, 4
        mov     esi, 4
        call    r14
        test    rax, rax
        je      .LBB1_4
        mov     dword ptr [rax], 42
        mov     qword ptr [rsp + 8], rax
        mov     rax, qword ptr [rsp + 8]
        mov     qword ptr [rsp + 8], rax
        dec     ebx
        jne     .LBB1_1
        add     rsp, 16
        pop     rbx
        pop     r14
        pop     r15
        ret
.LBB1_4:
        mov     edi, 4
        mov     esi, 4
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2
The important parts to notice here are the labels .LBB0_1: and .LBB1_1: and the jumps jne .LBB0_1 and jne .LBB1_1, which form the two for loops. This shows that the loops did not get optimized away.
Also note the mov r14, qword ptr [rip + __rust_alloc@GOTPCREL] and call r14, which together make the actual call that performs the heap allocation. So this one didn't get optimized away either.
Also, notice the interesting-looking cmp eax, 4200042. It shows that the compiler reworked the first loop; instead of doing:
let mut sum = 0;
for _ in 0..100000 {
    sum += 42;
    std::hint::black_box(sum);
}
it optimized it to the equivalent of
let mut sum = 42;
while sum != 4200042 {
    std::hint::black_box(sum);
    sum += 42;
}
which passes the same 100000 values (42, 84, ..., 4200000) to black_box and reuses the sum variable as the loop counter :)
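To double-check the arithmetic behind that constant, note that 4200042 = 42 * 100001. In the assembly, eax starts at 42, is stored, and is then incremented until it hits that value. Below is a rough sketch that mirrors both loop shapes in plain Rust and counts how many values each one would hand to black_box. The function names are made up for illustration, and a plain counter stands in for the black_box call.

```rust
/// Loop as written in the source: add first, then observe the value.
pub fn source_loop() -> (u32, i32) {
    let mut sum = 0;
    let mut observed = 0;
    for _ in 0..100000 {
        sum += 42;
        observed += 1; // stands in for black_box(sum)
    }
    (observed, sum) // (number of observed values, last observed value)
}

/// Loop as it appears in the assembly: store first, then add and compare.
pub fn asm_style_loop() -> (u32, i32) {
    let mut eax = 42; // mov eax, 42
    let mut stores = 0;
    loop {
        stores += 1; // stands in for: mov dword ptr [rsp], eax
        eax += 42; // add eax, 42
        if eax == 4_200_042 {
            // cmp eax, 4200042 / jne .LBB0_1 falls through here
            break;
        }
    }
    // The last stored value is eax before the final add.
    (stores, eax - 42)
}
```

Both functions report 100000 observed values ending at 4200000, so the compare-against-4200042 form does the same work without a separate counter.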
Now compared to how it was before:
use std::time::Instant;

pub fn simple_sum() {
    let mut sum = 0;
    for _ in 0..100000 {
        sum += 42;
    }
}

pub fn many_heap_calls() {
    for _ in 0..100000 {
        let b = Box::new(42);
        Box::leak(b);
    }
}

fn main() {
    let start = Instant::now();
    simple_sum();
    println!("Simple sum: {:?}", start.elapsed());

    let start2 = Instant::now();
    many_heap_calls();
    println!("Many heap calls: {:?}", start2.elapsed());
}
example::simple_sum:
        ret

example::many_heap_calls:
        ret
I don't think this one requires further explanation ;)