I ran into this question while answering another person's question: how do compilers optimize code? Can keywords like const help? (This is aside from volatile, inline functions, and optimizing the code all by yourself.)

-
This is my second day on Stack Overflow, but I'm already seeing the const-optimization topic for the third time... Why are programmers so obsessed with optimization, especially when in most cases it is not needed? This is bound to be a dupe – Armen Tsirunyan Oct 10 '10 at 19:58
-
Very interesting question. Try reading http://en.wikipedia.org/wiki/Compiler_optimization – usr-local-ΕΨΗΕΛΩΝ Oct 10 '10 at 20:01
-
@djechelon: Thanks :D, I have already done that. I mean the very specific cases; you know, the cases that mostly concern gcc. – Amir Zadeh Oct 10 '10 at 20:06
-
@Armen - May I ask why you believe optimization is 'not needed'? – linuxuser27 Oct 10 '10 at 20:07
-
@linuxuser27 - If you're creating a calculator app that adds 1+1, do you really need to optimize? The answer will show in less than the blink of an eye anyway; this is what cheap calculators do: use the slowest processor that can still produce the answer in an unnoticeable amount of time. Of course, optimization is needed in _some_ situations, just not most of them. I do believe optimization is something some programmers like to do, though. – Dominic K Oct 10 '10 at 20:11
-
If you're really interested, head on over to a large library (a university library is most likely to have one) and pick up a copy of "The Dragon Book" (http://en.wikipedia.org/wiki/Dragon_Book_%28computer_science%29). Any edition will do. Even if you're not interested in writing a compiler, I guarantee that you will learn *something* from this book. – Tergiver Oct 10 '10 at 20:14
-
@DMan - Agreed, of course there are places where optimization is not needed. But that was not the statement; it was 'in most cases it is not needed'. I find this kind of naive. If compilers did not optimize code, most programs would run considerably slower. So my question is: why does one state that in 'most cases' optimization is not needed? – linuxuser27 Oct 10 '10 at 20:14
-
Also, here is a good link I found; it is not my answer, but it is very good: http://www.strchr.com/what_your_compiler_can_do_for_you – Amir Zadeh Oct 10 '10 at 20:18
-
@linuxuser27: I think what @Armen was trying to say was "In most cases, [programmer-guided] optimization is not needed." Really, choosing the simplest and cleanest approach and algorithm for a problem has far more beneficial results than trying to squeeze an extra instruction through the CPU. – GManNickG Oct 10 '10 at 20:22
-
Amen @Armen ;) For some applications, optimizing the bottlenecks is viable. But that should be at most 1% of the programming. Yet it feels like 10% of the questions on SO go like "is this faster than that" or "will this be optimized"; in only 15% of those cases might the code in question actually be a bottleneck, and in at most 5% does the OP actually *know* it's a bottleneck. The cases where it's actually relevant are another order of magnitude rarer. It's horrible. – Oct 10 '10 at 20:26
-
GMan answered for me :) Always take heed of the two rules of optimization – Armen Tsirunyan Oct 10 '10 at 20:26
-
@GMan and Armen - I see. I completely agree with you then. Manual optimizations are likely not needed most of the time. – linuxuser27 Oct 10 '10 at 22:04
-
@linuxuser27: I'm sort of surprised at the idea that manual optimization is not needed. For little micro-optimizations I can see that, but there's a wide world of macro optimization, where the stakes are much higher, and only the programmer can do it. And it's not just about the "right algorithm". It's about sweeping away complexity. Example: 43x speedup - http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773 – Mike Dunlavey Oct 11 '10 at 23:36
3 Answers
Compilers are free to optimize code so long as they can guarantee the semantics of the code are not changed.
I would suggest starting at the Compiler optimization Wikipedia page, as there are many different kinds of optimization performed at many different stages.
As you can see, modern compilers are very 'smart' at optimizing code (compiled C code is often faster than hand-written assembly unless the programmer really knows how to take advantage of all the specific processor instructions and quirks). As others have said, write for clarity first based on a good design.
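As a rough, hedged illustration (my sketch, not part of the original answer): in code like the following, a mature optimizer is typically free to fold the constant expression, delete the dead store, and unroll or vectorize the loop, because none of those transformations change the observable behavior:

int scale_sum( const int *v, int n )
{
    int unused = 42;    /* dead store: an optimizer may remove it entirely */
    int sum = 0;
    for ( int i = 0; i < n; i++ )
    {
        /* 60 * 60 is folded to 3600 at compile time; the loop itself
           is a candidate for unrolling or vectorization */
        sum += v[i] * 60 * 60;
    }
    return sum;
}

Comparing the assembly from something like gcc -O0 -S versus gcc -O2 -S on a snippet like this is an easy way to see such transformations for yourself.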
-
Compilers can do that if they know what hardware the code will run on! Sometimes, when the GPU comes in, they mess up. I recently wrote code that used both the CPU and the GPU (CUDA), and the bug was just that simple O2 optimization; when I turned it off, everything made sense. – Amir Zadeh Oct 10 '10 at 20:13
-
@Green Code: Compilers are software too, so of course they are sometimes buggy. But for mature compilers, the output is usually correct and blazing fast compared to whatever most programmers could write on their own. – Oct 10 '10 at 20:30
-
@Green Code: there are various levels of optimization. Some are machine independent, some not. Machine-dependent ones may not be bug-free, especially for a young architecture, simply because they haven't been tested extensively yet. CUDA brings a new difficulty too: suddenly there are parts of the code that should be optimized for the CPU and others for the GPU. None of the C++ compilers that I know of were meant to optimize for two different architectures at once. – Matthieu M. Oct 11 '10 at 06:36
-
Compilers aren't that smart. They can't even reorder a simple Boolean expression like (A && B && C && D) to optimize the short-circuiting. They play it safe, because they can't determine whether the operands have side effects, and since they can't determine that, they can't safely reorder the expression. Since they aren't doing that, they also aren't calculating operand complexity at compile time, nor profiling the cost of evaluating the operands to see which ones best trigger short-circuits and which ones are expensive and better left short-circuited (i.e. ordered last). – Triynko Dec 01 '11 at 21:52
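A hedged sketch of why that reordering is unsafe in general (the functions here are hypothetical, mine, not from the comment): if any operand can have a side effect, evaluating it early changes the program's behavior.

#include <stdio.h>

int cheap( void )     { return 0; }                     /* no side effects */
int expensive( void ) { puts( "logged!" ); return 1; }  /* side effect: I/O */

int main( void )
{
    /* Short-circuiting guarantees expensive() is never called here.
       Reordering to expensive() && cheap() would print "logged!",
       so a compiler may only reorder if it can prove every operand
       is free of side effects. */
    if ( cheap() && expensive() )
        puts( "both true" );
    return 0;
}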
One very big thing you can do (beyond what the compiler can do for you) is to be aware of the cache. Since accessing memory is really expensive, the cache tries to help you by storing not only the data you accessed but the nearby elements as well. This is why foo will run so much faster than bar:
#define NUM_ROWS 1024   /* example sizes (mine); anything much larger than the cache shows the effect */
#define NUM_COLS 1024

int array[ NUM_ROWS ][ NUM_COLS ];

int foo( void )
{
    int row, col;
    int sum = 0;
    // accesses the elements of the array contiguously
    for ( row = 0; row < NUM_ROWS; row++ )
    {
        for ( col = 0; col < NUM_COLS; col++ )
        {
            sum += array[ row ][ col ];
        }
    }
    return sum;
}

int bar( void )
{
    int row, col;
    int sum = 0;
    // skips from row to row ( big jumps that might miss the cache )
    for ( col = 0; col < NUM_COLS; col++ )
    {
        for ( row = 0; row < NUM_ROWS; row++ )
        {
            sum += array[ row ][ col ];
        }
    }
    return sum;
}
Edit:
Another thing to be aware of is repeated string concatenation. Done wrong, this can make code that otherwise seems to run in O( n ) actually run in O( n^2 ); see an article on Joel on Software.
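A minimal sketch of that pattern (my illustration; the helper functions are hypothetical, not from the article): strcat() rescans the whole destination on every call, so concatenating n pieces this way costs O( n^2 ), while simply remembering where the string currently ends keeps it O( n ):

#include <string.h>

/* Both helpers assume dst is large enough to hold the result. */

/* O( n^2 ): strcat() walks dst from the start on every iteration */
void concat_slow( char *dst, const char *pieces[], int n )
{
    dst[ 0 ] = '\0';
    for ( int i = 0; i < n; i++ )
        strcat( dst, pieces[ i ] );
}

/* O( n ): keep a pointer to the end instead of re-finding it */
void concat_fast( char *dst, const char *pieces[], int n )
{
    char *end = dst;
    for ( int i = 0; i < n; i++ )
    {
        size_t len = strlen( pieces[ i ] );
        memcpy( end, pieces[ i ], len );
        end += len;
    }
    *end = '\0';
}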
Edit: s/disk/memory/

-
Funny that you should care about such low-level things and still write row++ instead of ++row :P – Armen Tsirunyan Oct 10 '10 at 20:01
-
Being aware of the cache will save you a lot more than ++row :P @jayrdub: that involves an explanation of how memory *actually* works in the machine. Basically, `array[ row ][ col ]` is a call to main memory which is originally stored on the hard disk. Because the hard disk moves so much slower than the CPU, computers will store information in a 'cache' where it is easier to access. – Alex Reece Oct 10 '10 at 20:02
-
@Armen Tsirunyan - The value of the expression `row++` is not used here. Your compiler will likely generate identical code for `++row` and `row++` in such a case. – asveikau Oct 10 '10 at 20:06
-
@Alex: Thanks, Alex. This was the answer I gave to the person whose question I answered. Things like this are hidden in our code and we don't notice them. I really appreciate the experience :D. – Amir Zadeh Oct 10 '10 at 20:09
-
@Alex: For most cases (i.e. those not involving arrays a large fraction of the size of main memory), the cache at issue when dealing with row-major versus column-major array access is one or more levels of the *CPU* cache and has nothing to do with the *disk* cache. – dmckee --- ex-moderator kitten Oct 10 '10 at 20:10
-
@Alex Reece - @jayrdub is right, this is memory, not disk. That said, memory and disk can be equivalent, e.g. if this part of memory is not paged in and the kernel has to go to disk to retrieve it. – asveikau Oct 10 '10 at 20:12
-
So you're saying accessing column-major vs row-major won't increase my L1 cache hit rate? – Alex Reece Oct 10 '10 at 20:20
-
@Alex Reece - I don't think anyone is saying that. The objection is that you say "disk" when you are referring to memory. – asveikau Oct 10 '10 at 20:47
-
@Alex Reece: One of the most common optimizations is turning the loops inside out to increase cache hits. It is likely that the optimizer would transform the column-major version into a row-major one if it detects that (given the size) this would increase the cache hit rate. – Matthieu M. Oct 11 '10 at 06:30
-
@Alex Reece: Wikipedia's article: http://en.wikipedia.org/wiki/Loop_interchange – Matthieu M. Oct 11 '10 at 06:33
-
@Alex: gcc (-floop-interchange), see http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, and I think it's the one called loop-rotate in llvm: http://llvm.org/docs/Passes.html#loop-rotate – Matthieu M. Oct 11 '10 at 18:47
The rules of optimization:
- Don't do it
- Advanced Users Only: Don't do it yet
Edit: The quote (and other information, useful or not) can be found in the Coding Horror article "Hardware is Cheap, Programmers are Expensive". It would be nice to find the 'origin' of this phrase/quote, though; the two rules are often attributed to Michael A. Jackson.

-
The question is not about manual optimization, but what compilers may do and how they do it. – DerKuchen Oct 10 '10 at 20:06
-
Hi, thank you for your advice, but I think I am on the verge of learning this completely, so I don't want to misunderstand it. – Amir Zadeh Oct 10 '10 at 20:07