Compiler optimization of properties that remain static for the duration of a loop

Question

I was reading Improving .NET Application Performance and Scalability. The section titled Avoid Repetitive Field or Property Access contains a guideline:

If you use data that is static for the duration of the loop, obtain it before the loop instead of repeatedly accessing a field or property.

The following code is given as an example of this:

for (int item = 0; item < Customer.Orders.Count; item++)
{
   CalculateTax(Customer.State, Customer.Zip, Customer.Orders[item]);
}

becomes

string state = Customer.State;
string zip = Customer.Zip;
int count = Customer.Orders.Count;
for (int item = 0; item < count; item++)
{
   CalculateTax(state, zip, Customer.Orders[item]);
}

The article states:

Note that if these are fields, it may be possible for the compiler to do this optimization automatically. If they are properties, it is much less likely. If the properties are virtual, it cannot be done automatically.

Why is it "much less likely" for properties to be optimized by the compiler in this manner, and when can one expect for a particular property to be or not to be optimized? I would assume that properties whose accessors perform additional operations are harder for the compiler to optimize, and that those that only return a backing field are more likely to be optimized, but I would like some more concrete rules. Are auto-implemented properties always optimized?

score 4 · Answer 1 · answered Jan 29 '16 at 20:11

Why is it "much less likely" for properties to be optimized by the compiler in this manner, and when can one expect for a particular property to be or not to be optimized?

Properties are not always just wrappers for a field. If there is any degree of logic in a property, it becomes significantly more difficult for a compiler to prove that it is correct to re-use the value it first got when the loop began.

As an extreme example, consider

private Random rnd = new Random();
public int MyProperty
{
    get { return rnd.Next(); }
}
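
To make the difference observable, here is a minimal sketch. The names `Sample`, `Counter`, `SumRepeated` and `SumHoisted` are invented for illustration, and a simple counter stands in for `Random` so the result is deterministic. It shows that reading a getter with logic once before the loop produces a different result than reading it on every iteration, so the compiler may only hoist the read if it can prove the getter always returns the same value.

```csharp
using System;

public class Sample
{
    private int reads = 0;

    // Hypothetical getter with logic: every read returns a new value.
    public int Counter
    {
        get { return ++reads; }
    }
}

public static class HoistDemo
{
    // Reads the property on every iteration, as the loop is written.
    public static int SumRepeated(Sample s, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += s.Counter;
        }
        return sum;
    }

    // Reads the property once before the loop (hoisted by hand).
    public static int SumHoisted(Sample s, int iterations)
    {
        int value = s.Counter;
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += value;
        }
        return sum;
    }
}
```

With three iterations on a fresh `Sample`, `SumRepeated` adds 1 + 2 + 3 while `SumHoisted` adds 1 three times, so the two versions are not interchangeable.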

Eric J.

And even if it *does* just return the value of a backing field, the compiler would need to prove that the backing field can't ever change throughout the loop, and that's a hard (often impossible) thing to prove in most cases. – Servy Jan 29 '16 at 20:13
@Servy: Yes, but I believe that is approximately the same case as `Note that if these are fields, it may be possible for the compiler to do this optimization automatically` – Eric J. Jan 29 '16 at 20:14
If the compiler can prove that the property *only* returns the value of a field, *then* it's the same. It may or may not know if the property does just that. – Servy Jan 29 '16 at 20:17
@EricJ. this certainly makes sense. There is a problem with the extreme example though, in that it's not a piece of code that this optimization applies to (the coding guideline requires it to be a property that remains static for the loop duration, otherwise there's nothing for the developer to do here). – Owen Pauling Jan 29 '16 at 20:22
@OwenPauling: The issue is for the compiler to *prove* that it stays static for the duration of the loop. The compiler would have to understand whether or not rnd.Next() will remain static for the duration of the loop (humans easily know that it does not, but in the general case that's a hard task for the compiler). – Eric J. Jan 29 '16 at 20:48

score 4 · Accepted Answer · edited May 23 '17 at 11:45

It requires the jitter to apply two optimizations:

First, the property getter method must be inlined so it turns into the equivalent of a field access. That tends to work when the getter is small and does not throw exceptions. This is necessary so the optimizer can be sure that the getter does not rely on state that can be affected by other code.

Note how the hand-optimized code would be wrong if, say, the Customer.Orders[] indexer altered the Customer.State property. Lazy code like this is pretty unlikely of course, but it's not like this has never been done :) The optimizer has to be sure.
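
A sketch of that failure mode, under loudly stated assumptions: `LazyCustomer`, `OrderList` and the "load on first access" behavior are hypothetical, made up only to show how an indexer with a side effect makes the hoisted version observe a stale `State`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical order collection whose indexer has a side effect on its owner.
public class OrderList
{
    private readonly LazyCustomer owner;
    private readonly List<string> items = new List<string> { "A", "B" };

    public OrderList(LazyCustomer owner) { this.owner = owner; }

    public int Count { get { return items.Count; } }

    public string this[int index]
    {
        get
        {
            // Side effect: accessing an order "loads" the customer record.
            owner.State = "WA";
            return items[index];
        }
    }
}

public class LazyCustomer
{
    public string State { get; set; }
    public OrderList Orders { get; private set; }

    public LazyCustomer()
    {
        State = "(not loaded)";
        Orders = new OrderList(this);
    }
}

public static class SideEffectDemo
{
    // State is read on every iteration, before the indexer call.
    public static List<string> SeenInLoop(LazyCustomer c)
    {
        var seen = new List<string>();
        for (int i = 0; i < c.Orders.Count; i++)
        {
            string state = c.State;
            string order = c.Orders[i]; // changes State as a side effect
            seen.Add(state + ":" + order);
        }
        return seen;
    }

    // State is read once, before any indexer call has run.
    public static List<string> SeenHoisted(LazyCustomer c)
    {
        var seen = new List<string>();
        string state = c.State;
        int count = c.Orders.Count;
        for (int i = 0; i < count; i++)
        {
            seen.Add(state + ":" + c.Orders[i]);
        }
        return seen;
    }
}
```

On the second iteration the in-loop version sees the updated state while the hoisted version still carries the value captured before the loop. An optimizer that cannot see into the indexer has to assume this can happen.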

Secondly, the field access code has to be hoisted out of the loop body, an optimization called "invariant code motion". It works on simple property getter code when the jitter can prove that the statements inside the loop body don't affect the value.
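
As a source-level picture of what the two steps amount to, here is a small sketch. The names `Settings`, `Rate`, `TotalAsWritten` and `TotalAfterHoisting` are invented for illustration, and the second method is written by hand; it is not actual jitter output, and whether a given runtime performs the transformation is not guaranteed.

```csharp
using System;

public class Settings
{
    private readonly int rate;

    public Settings(int rate) { this.rate = rate; }

    // Trivial getter: small, no exceptions, reads a field that never changes.
    public int Rate { get { return rate; } }
}

public static class InvariantDemo
{
    // The loop as the programmer wrote it: the getter is called per iteration.
    public static int TotalAsWritten(Settings s, int[] amounts)
    {
        int total = 0;
        for (int i = 0; i < amounts.Length; i++)
        {
            total += amounts[i] * s.Rate;
        }
        return total;
    }

    // Equivalent code after inlining the getter and moving the
    // loop-invariant field read in front of the loop.
    public static int TotalAfterHoisting(Settings s, int[] amounts)
    {
        int rate = s.Rate; // invariant: nothing in the loop body can change it
        int total = 0;
        for (int i = 0; i < amounts.Length; i++)
        {
            total += amounts[i] * rate;
        }
        return total;
    }
}
```

The transformation is safe here only because nothing in the loop body can write to the backing field, which is exactly the property the optimizer has to prove.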

The jitter optimizer implements it but it is not stellar at it. In this particular case it is pretty likely that it will give up when it cannot inline the CalculateTax() method. A native compiler optimizes much more aggressively because it can afford to burn the memory and analysis time on it. The jitter optimizer must meet a pretty hard deadline to avoid pauses.

Do keep the constraints of the optimizer in mind when you do this yourself. You get a pretty darn ugly bug, of course, if these methods do have side effects that you did not count on. And only do this when the profiler has told you that this code is on the hot path, the typical ~10% of your code that actually affects the execution time. Low odds here: the database query to get customer/order data is going to be orders of magnitude more expensive than calculating tax. Luckily, code transforms like this also tend to make code more readable, so you usually get it for free. YMMV.

A backgrounder on jitter optimizations is here.
