Your proposed optimization is really a combination of two simpler transformations. The first is pulling the member access out of the loop. From
    for(int i = 0; i != 10; i++)
    {
        var localVar = this.memberVar;
        if(localVar)
            DoStuff();
        else
            DoOtherStuff();
    }
to
    var localVar = this.memberVar;
    for(int i = 0; i != 10; i++)
    {
        if(localVar)
            DoStuff();
        else
            DoOtherStuff();
    }
The second is interchanging the loop condition with the if condition. From
    var localVar = this.memberVar;
    for(int i = 0; i != 10; i++)
    {
        if(localVar)
            DoStuff();
        else
            DoOtherStuff();
    }
to
    var localVar = this.memberVar;
    if (localVar) {
        for(int i = 0; i != 10; i++)
            DoStuff();
    }
    else {
        for(int i = 0; i != 10; i++)
            DoOtherStuff();
    }
The first one is influenced by readonly. To do it, the compiler has to prove that memberVar cannot change inside the loop, and readonly guarantees this¹. Even though this loop could be called inside a constructor, and the value of memberVar could be changed in the constructor after the loop ends, it cannot be changed in the loop body: neither DoStuff() nor DoOtherStuff() is a constructor of the current object. Reflection does not count: while it may be possible to use Reflection to break invariants, it isn't permitted to do so. Threads do count; see the footnote.
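The constructor case above can be sketched as follows. This is only an illustration under assumed names: the class Worker and the call counters are hypothetical and are not from the question. It shows a readonly field that is legally reassigned after the loop, while the methods called from the loop body are not allowed to assign it.

```csharp
using System;

// Hypothetical class for this sketch; only memberVar, DoStuff and
// DoOtherStuff correspond to names used in the discussion above.
public sealed class Worker
{
    private readonly bool memberVar;
    public int StuffCalls;       // illustrative counters, not part of the question
    public int OtherStuffCalls;

    public Worker()
    {
        memberVar = true;

        // The loop runs inside the constructor, but the methods it calls
        // are ordinary methods and therefore cannot assign memberVar.
        for (int i = 0; i != 10; i++)
        {
            if (memberVar)
                DoStuff();
            else
                DoOtherStuff();
        }

        // Legal: a constructor may assign a readonly field more than once.
        // Here that happens only after the loop has finished.
        memberVar = false;
    }

    public bool MemberVar => memberVar;

    private void DoStuff()
    {
        StuffCalls++;
        // memberVar = false;   // would not compile (error CS0191)
    }

    private void DoOtherStuff() => OtherStuffCalls++;
}

public static class ReadonlyDemo
{
    public static void Main()
    {
        var w = new Worker();
        Console.WriteLine($"{w.StuffCalls} {w.OtherStuffCalls} {w.MemberVar}");
        // prints "10 0 False"
    }
}
```

Within this single thread, every iteration sees the same value, which is what would make hoisting the read out of the loop safe.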
The second is a simple transformation but a harder decision for the compiler to make, because it is difficult to predict whether it will actually improve performance. You can examine it separately by applying the first transformation to the code yourself and seeing what code is generated.
Perhaps a more important consideration is that in .NET, the optimization pass takes place between MSIL and machine code (in the JIT compiler), not during compilation of C# to IL. So you cannot see which optimizations are being done by looking at the MSIL!
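One way to look at the generated machine code rather than the MSIL is to have the JIT print its disassembly. The sketch below is a minimal test program under stated assumptions: the class Sample and the counters are hypothetical, and the DOTNET_JitDisasm environment variable is assumed to be available, which should be the case on .NET 7 or later. On older runtimes a different tool would be needed.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class for this sketch.
public sealed class Sample
{
    private readonly bool memberVar;
    private int stuff;
    private int otherStuff;

    public Sample(bool flag) { memberVar = flag; }

    // NoInlining keeps Run as a separate method so its code is easy to find.
    // AggressiveOptimization asks the JIT to skip tiering and emit fully
    // optimized code on the first call.
    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
    public int Run()
    {
        for (int i = 0; i != 10; i++)
        {
            if (memberVar)
                DoStuff();
            else
                DoOtherStuff();
        }
        return stuff - otherStuff;
    }

    // Kept out of line so the calls remain visible in the disassembly.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void DoStuff() => stuff++;

    [MethodImpl(MethodImplOptions.NoInlining)]
    private void DoOtherStuff() => otherStuff++;
}

public static class JitDemo
{
    public static void Main()
    {
        // Assumption: build in Release and run with the environment variable
        // DOTNET_JitDisasm=Run set to make the runtime print the code for Run.
        Console.WriteLine(new Sample(true).Run());   // prints 10
    }
}
```

In the printed code you can then check whether the load of memberVar sits inside or outside the loop, and whether the loop has been split into two.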
¹ Or does it? The .NET memory model is considerably more forgiving than e.g. the C++ model, where any data race leads very quickly to undefined behavior unless the object is defined volatile/atomic. What if this loop runs in a worker thread spawned from the object constructor, and after spawning the thread, the constructor goes on (which I'll call the "second half") to change the readonly member? Does the memory model require that change to be seen by the worker thread? What if DoStuff() and the second half of the constructor force memory fences, for example by accessing other members which are volatile, or by taking a lock? If the change must be visible, readonly would only allow the optimization in a single-threaded environment.
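The scenario in the footnote can be sketched like this. It is an illustration only: the class Racy, the gate object and the counters are hypothetical names, and the program does not settle what the memory model guarantees. It merely sets up the race described above, with both DoStuff() and the second half of the constructor taking the same lock.

```csharp
using System;
using System.Threading;

// Hypothetical class for this sketch.
public sealed class Racy
{
    private readonly object gate = new object();
    private readonly bool memberVar;
    private readonly Thread worker;
    public int StuffCalls;       // illustrative counters
    public int OtherStuffCalls;

    public Racy()
    {
        memberVar = true;

        // The worker starts while the constructor is still running.
        worker = new Thread(Loop);
        worker.Start();

        // "Second half" of the constructor: change the readonly member
        // under a lock, which also acts as a memory fence.
        lock (gate)
        {
            memberVar = false;
        }
    }

    public bool MemberVar => memberVar;

    public void Join() => worker.Join();

    private void Loop()
    {
        for (int i = 0; i != 10; i++)
        {
            if (memberVar)
                DoStuff();
            else
                DoOtherStuff();
        }
    }

    private void DoStuff()
    {
        lock (gate) { StuffCalls++; }
    }

    private void DoOtherStuff()
    {
        lock (gate) { OtherStuffCalls++; }
    }
}

public static class RaceDemo
{
    public static void Main()
    {
        var r = new Racy();
        r.Join();
        // The split between the two counters depends on timing and on
        // whether the read of memberVar was hoisted out of the loop.
        Console.WriteLine($"{r.StuffCalls} {r.OtherStuffCalls}");
    }
}
```

If the read of memberVar were hoisted, the worker would take the same branch on all ten iterations; without hoisting, it could switch branches partway through. Only the total of ten calls is fixed.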