Overhead and compiler optimization of dereference of nested struct elements

Question

I'm wondering whether compilers (gcc with -O3, more specifically) can or will optimize out nested struct element dereferences (or even non-nested ones).

For example, is there any point in writing the following code

register int i = 0;
register double multiple = struct1->struct2->element1;
for (i = 0; i < 10000; i++)
  result[i] = multiple * -struct1->struct3->element3[i];

instead of

register int i = 0;
for (i = 0; i < 10000; i++)
  result[i] = struct1->struct2->element1 * -struct1->struct3->element3[i];

I'm looking for the most optimized version, but I'm not going to go through my code and move struct dereferences out of loops if the compiler will do that for me. If it does, I think my best option is the following

register int i = 0;
register double* R = &result[0];
register double* amount = &struct1->struct3->element3[0];
for (i = 0; i < 10000; i++, R++, amount++)
  *R = struct1->struct2->element1 * -*amount;

which eliminates all unnecessary dereferences etc. (I think). Would the two dereferences needed to reach element3 be optimized?

Any thoughts? Thanks

score 2 · Accepted Answer · answered Feb 27 '15 at 04:59


This optimization is known as Loop-invariant code motion. Loop invariants (things that never change inside the loop) are moved outside of the loop, to avoid re-calculating the same thing over and over.

GCC supports it; it is enabled by the -fmove-loop-invariants flag:

-fmove-loop-invariants Enables the loop invariant motion pass in the new loop optimizer. Enabled at level -O1

Today, compilers are almost always smart enough to do the "right thing" no matter how you formulate your code. Focus on writing the simplest, cleanest, easiest to read (for a human!) code you can. Let the compiler take care of the rest by enabling optimizations. -O2 is commonly used.
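As a rough sketch of what that means for the question's loop, the two functions below compute the same values, one written the simple way and one with the invariant loads hoisted by hand. The struct definitions and the names `inner2`, `inner3`, `outer`, `scale_simple` and `scale_hoisted` are assumptions made up for illustration; the question does not show its real types.

```c
#include <stddef.h>

/* Hypothetical types mirroring the shape used in the question. */
struct inner2 { double element1; };
struct inner3 { double *element3; };
struct outer  { struct inner2 *struct2; struct inner3 *struct3; };

/* Simple version: every dereference is written inside the loop. */
void scale_simple(double *result, const struct outer *struct1, size_t n)
{
    for (size_t i = 0; i < n; i++)
        result[i] = struct1->struct2->element1 * -struct1->struct3->element3[i];
}

/* Hand-hoisted version: the loop-invariant loads happen once, up front.
 * This matches scale_simple only when result does not overlap the data
 * reachable through struct1. */
void scale_hoisted(double *result, const struct outer *struct1, size_t n)
{
    const double multiple = struct1->struct2->element1;
    const double *amount = struct1->struct3->element3;
    for (size_t i = 0; i < n; i++)
        result[i] = multiple * -amount[i];
}
```

Compiling a file like this with `gcc -O2 -S` and comparing the two loop bodies in the generated assembly is one way to check what your compiler actually does. One caveat: the compiler can only hoist a load if it can show that the stores to `result[i]` cannot change the loaded value, so the outcome may depend on what it can prove about aliasing in your real code.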


Jonathon Reinhart


@James: Also, `register` is really neither necessary nor useful any more. Trust your compiler to figure that out. – rici Feb 27 '15 at 05:04
@rici Really? Hmm, OK. For ages now I have been putting register everywhere it might be needed, on the assumption that if it runs out of registers it'll just use RAM. I'm guessing compilers have come a long way! – dogAwakeCat Feb 27 '15 at 05:13
@James Write some functions, and look at the generated assembly with optimizations off, and then again at `-O2`. It's pretty amazing what they can do. – Jonathon Reinhart Feb 27 '15 at 05:30
@JonathonReinhart yeh, I'll have to do that and reconsider my position on delegation to the compiler. Thanks! – dogAwakeCat Feb 27 '15 at 05:46
@James: As of C++11 (at least), Appendix D.2 of the standard: "The use of the register keyword as a storage-class-specifier (7.1.1) is deprecated." Appendix D lists the features which are still "[n]ormative for the current edition of the Standard, but not guaranteed to be part of the Standard in future revisions." – rici Feb 27 '15 at 07:11
@James `register` is ignored by MSVC and most modern compilers http://stackoverflow.com/a/5120057/995714 – phuclv Feb 27 '15 at 08:04
@JonathonReinhart Apologies for dragging this back up. I've just realised: does loop-invariant code motion also apply to the dereferences, effectively replacing them with something like my `amount` pointer? For example, does my third case get optimised automagically? I can't find any references saying whether this optimisation can determine that a dereference is loop invariant. Let's say (ignoring pointer arithmetic) outside the loop `amount = &struct1->struct3->element[0]` and then inside the loop `amount[i]` -- would this be detected by the optimisation? – dogAwakeCat Apr 09 '15 at 07:56
