How to avoid elimination of dead code in BenchmarkDotNet?

Question

The BenchmarkDotNet documentation says that benchmarks should use the result of the code being measured, so that the JIT compiler doesn't eliminate that code as dead, but it doesn't go into much detail about how to do this:

You should also use the result of calculation. For example, if you run the following code:

    void Foo()
    {
        Math.Exp(1);
    }

then JIT can eliminate this code because the result of Math.Exp is not used. The better way is use it like this:

    double Foo()
    {
        return Math.Exp(1);
    }

So I thought I'd try an experiment, to see whether assigning the result of what I'm benchmarking to a public property of the class containing the benchmark methods is enough to avoid the JIT compiler considering any of my code to be dead.

Here are my benchmarks, along with some contrived code whose performance they measure:

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Jobs;

    [SimpleJob(RuntimeMoniker.NetCoreApp31)]
    [SimpleJob(RuntimeMoniker.Net50)]
    public class MyBenchmarks
    {
        private readonly int limit = 10000000;
        public static int StaticProperty { get; set; }
        public int InstanceProperty { get; set; }

        [Benchmark]
        public void A()
        {
            this.Count(this.limit);
        }

        [Benchmark]
        public void B()
        {
            this.CountTo(this.limit);
        }

        [Benchmark]
        public void C()
        {
            StaticProperty = this.CountTo(this.limit);
        }

        [Benchmark]
        public void D()
        {
            this.InstanceProperty = this.CountTo(this.limit);
        }

        private void Count(int limit)
        {
            for (var i = 0; i < limit; i++)
            {
            }
        }

        private int CountTo(int limit)
        {
            var returnValue = 0;
            for (var i = 0; i < limit; i++)
            {
                returnValue++;
            }

            return returnValue;
        }
    }

And my Program.cs:

    using BenchmarkDotNet.Running;

    public static class Program
    {
        public static void Main(string[] args)
        {
            BenchmarkRunner.Run<MyBenchmarks>();
        }
    }

I was expecting method A, and maybe also B, to be significantly faster than methods C and D. A executes a loop which doesn't do anything, and B executes a loop but doesn't make use of the result, so in both cases the JIT compiler might optimise the loop out of the final executable.

The actual result is that there doesn't seem to be a significant difference in the execution times of any of the methods:

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1466 (21H1/May2021Update)
Intel Core i5-2320 CPU 3.00GHz (Sandy Bridge), 1 CPU, 4 logical and 4 physical cores
.NET SDK=5.0.404
  [Host]        : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT
  .NET 5.0      : .NET 5.0.13 (5.0.1321.56516), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT


| Method |           Job |       Runtime |     Mean |     Error |    StdDev |
|------- |-------------- |-------------- |---------:|----------:|----------:|
|      A |      .NET 5.0 |      .NET 5.0 | 3.174 ms | 0.0170 ms | 0.0159 ms |
|      B |      .NET 5.0 |      .NET 5.0 | 3.322 ms | 0.0124 ms | 0.0116 ms |
|      C |      .NET 5.0 |      .NET 5.0 | 3.316 ms | 0.0155 ms | 0.0145 ms |
|      D |      .NET 5.0 |      .NET 5.0 | 3.318 ms | 0.0137 ms | 0.0122 ms |
|      A | .NET Core 3.1 | .NET Core 3.1 | 3.164 ms | 0.0091 ms | 0.0071 ms |
|      B | .NET Core 3.1 | .NET Core 3.1 | 3.354 ms | 0.0255 ms | 0.0226 ms |
|      C | .NET Core 3.1 | .NET Core 3.1 | 3.325 ms | 0.0146 ms | 0.0136 ms |
|      D | .NET Core 3.1 | .NET Core 3.1 | 3.312 ms | 0.0124 ms | 0.0116 ms |

// * Hints *
Outliers
  MyBenchmarks.D: .NET 5.0      -> 1 outlier  was  removed (3.35 ms)
  MyBenchmarks.A: .NET Core 3.1 -> 3 outliers were removed (3.20 ms..3.23 ms)
  MyBenchmarks.B: .NET Core 3.1 -> 1 outlier  was  removed (3.42 ms)

So I'm guessing one of three things may be happening here:

1. My Count and CountTo methods are too contrived to simulate something that we might want to benchmark in the real world.
2. The JIT compiler is smart enough to recognise that methods C and D are assigning the result of the CountTo method to a public property, but that nothing else reads the value of that property, and so is considering both the CountTo method and the assignment of the result to the property as dead code.
3. I'm still new to benchmarking and have completely misunderstood something important.

Which of these is the case?
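
For reference, the pattern from the quoted documentation can be applied to this class by having the benchmark method return the value, or by passing the value to BenchmarkDotNet's `Consumer` type. The following is a minimal sketch rather than a verified fix: the class name `ReturningBenchmarks` and the method names `E` and `F` are made up for illustration, and `CountTo` is copied from the code above. Using the result only stops the JIT from treating the computation as unused; it does not stop the JIT from simplifying the loop in other ways.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;

// Hypothetical class name; not part of the original experiment.
public class ReturningBenchmarks
{
    private readonly int limit = 10000000;

    // Consumer is BenchmarkDotNet's helper for "using" a value inside a benchmark.
    private readonly Consumer consumer = new Consumer();

    // Hypothetical benchmark E: the result is returned to BenchmarkDotNet,
    // as the quoted documentation recommends.
    [Benchmark]
    public int E()
    {
        return this.CountTo(this.limit);
    }

    // Hypothetical benchmark F: the result is handed to a Consumer instead of being returned.
    [Benchmark]
    public void F()
    {
        this.consumer.Consume(this.CountTo(this.limit));
    }

    // Same contrived loop as in the question.
    private int CountTo(int limit)
    {
        var returnValue = 0;
        for (var i = 0; i < limit; i++)
        {
            returnValue++;
        }

        return returnValue;
    }
}
```

These can be run the same way as the other benchmarks, by passing `ReturningBenchmarks` to `BenchmarkRunner.Run` in Program.cs.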

I'd expect an ahead-of-time compiler to optimize your `CountTo` loop away to just `max(0, limit)`; IDK if C#'s JIT spends enough time looking for loop transformations for that. That is in fact what GCC -O3 does compiling a C version of this for x86-64: https://godbolt.org/z/Ef4bs3PKG. Definitely worth looking at the JIT asm on https://sharplab.io/ or something. — Peter Cordes, Feb 09 '22 at 21:36
The optimizer is just smarter than you guessed. It *inlines* the Count/To method, then eliminates dead code, all that's left is the do-nothing for loop in both cases (it doesn't eliminate empty for loops). https://stackoverflow.com/a/4045073/17034 — Hans Passant, Feb 09 '22 at 21:59
Thanks for the links @PeterCordes. SharpLab looks interesting, but does it actually do any optimising? I notice that if I remove the A method, leaving the Count method as an unused private method, the Count method is still present in the JIT Asm, even when I set the top-right drop-down list to "Release". — sbridewell, Feb 10 '22 at 19:18
AFAIK it does as much optimization as would happen in real life with sufficient warm-up. But I don't really know C# and don't use it or Windows. But you're just saying that there's asm for an unused method; that might be SharpLab intentionally finding all the methods and calling them in warm-up loops to make sure they get fully JIT optimized. That's a pure guess, though. — Peter Cordes, Feb 10 '22 at 19:23
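
One way to check the claims in the comments above without leaving BenchmarkDotNet is to ask it to export the JIT-generated assembly for each benchmark by using its `[DisassemblyDiagnoser]` attribute. This is a minimal sketch under some assumptions: the class name `DisassemblyBenchmarks` and the benchmark `E` are made up for illustration, `B` and `CountTo` are copied from the question, and the attribute is used with its default settings.

```csharp
using BenchmarkDotNet.Attributes;

// Hypothetical class name. The attribute asks BenchmarkDotNet to include
// the disassembly of each benchmark method in its output.
[DisassemblyDiagnoser]
public class DisassemblyBenchmarks
{
    private readonly int limit = 10000000;

    // Same as B in the question: the result is discarded.
    [Benchmark]
    public void B()
    {
        this.CountTo(this.limit);
    }

    // Hypothetical variant that returns the result, for comparison.
    [Benchmark]
    public int E()
    {
        return this.CountTo(this.limit);
    }

    // Same contrived loop as in the question.
    private int CountTo(int limit)
    {
        var returnValue = 0;
        for (var i = 0; i < limit; i++)
        {
            returnValue++;
        }

        return returnValue;
    }
}
```

Comparing the assembly listed for `B` and `E` should show whether the loop survives in each case and whether `CountTo` was inlined, which is what the comments disagree about.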
