8

Let's say I have declared in the global scope:

const int a =0x93191;

And in the main function I have the following condition:

if(a>0)
    do_something

A strange thing I have noticed is that the RVDS compiler drops the if statement: there is no branch/jmp in the object file.

But if I write:

if(*(&a)>0)
    do_something

then the if (cmp and branch) is present in the compiled object file.


In contrast, GCC optimizes both at -O1, -O2, and -O3:

#include <stdio.h>
const int a = 3333;   /* explicit int: implicit int is not valid since C99 */

int main(void)
{
    if (a > 333)
        printf("first\n");

    return 0;
}

Compiled with -O3:

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    lea    0x3d(%rip),%rdi        # 0x100000f58
0x0000000100000f1b <main+11>:   callq  0x100000f2a <dyld_stub_puts>
0x0000000100000f20 <main+16>:   xor    %eax,%eax
0x0000000100000f22 <main+18>:   pop    %rbp
0x0000000100000f23 <main+19>:   retq   
End of assembler dump.

And this version:

#include <stdio.h>
const int a = 3333;   /* explicit int: implicit int is not valid since C99 */

int main(void)
{
    if (*(&a) > 333)
        printf("first\n");

    return 0;
}

gives:

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    lea    0x3d(%rip),%rdi        # 0x100000f58
0x0000000100000f1b <main+11>:   callq  0x100000f2a <dyld_stub_puts>
0x0000000100000f20 <main+16>:   xor    %eax,%eax
0x0000000100000f22 <main+18>:   pop    %rbp
0x0000000100000f23 <main+19>:   retq   
End of assembler dump.

GCC treats both as the same (as it should), and RVDS doesn't?


I tried to examine the effect of using volatile: RVDS did drop the if(a>333), but GCC didn't:

#include <stdio.h>
volatile const int a = 3333;   /* explicit int: implicit int is not valid since C99 */

int main(void)
{
    if (a > 333)
        printf("first\n");

    return 0;
}

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    cmpl   $0x14e,0x12a(%rip)        # 0x100001048 <a>
0x0000000100000f1e <main+14>:   jl     0x100000f2c <main+28>
0x0000000100000f20 <main+16>:   lea    0x39(%rip),%rdi        # 0x100000f60
0x0000000100000f27 <main+23>:   callq  0x100000f36 <dyld_stub_puts>
0x0000000100000f2c <main+28>:   xor    %eax,%eax
0x0000000100000f2e <main+30>:   pop    %rbp
0x0000000100000f2f <main+31>:   retq   
End of assembler dump.

There are probably some bugs in the version of the RVDS compiler I used.

0x90
  • 4
    What compiler and what compiler options? – Mats Petersson Jun 20 '13 at 17:13
  • 1
    @0x90 `armvct` I haven't heard of that compiler before? Can you post the link so that we can better mock them? – Mikhail Jun 20 '13 at 17:15
  • GCC 4.5.3 optimizes out the branch in both cases at `-O1` and higher for me. – Adam Rosenfield Jun 20 '13 at 17:16
  • @Mikhail I meant `rvds`, sorry for the typo. – 0x90 Jun 20 '13 at 17:29
  • I understand the answer, but this question isn't making much sense to me - it says there is a failure to optimize, then both disassemblies produce the same output - then the last bold line. Further, the compiler mentioned in the comments isn't what is being shown in the question. what is going on? – im so confused Jun 20 '13 at 17:34
  • The GCC example shows that with optimization enabled it drops both `if(a)` and `if(*(&a))`, but RVDS drops only the first one and leaves the second as is. That is the awkward thing: why doesn't RVDS treat both as identical when GCC does? – 0x90 Jun 20 '13 at 17:39
  • Seems like the conclusion is that "it's a compiler bug" - despite my answer. – Mats Petersson Jun 20 '13 at 17:47
  • Why are you comparing an *ARM* assembler to an *x86-gcc*? Did you try `static const int a =0x93191;`? It is possible for something to change this depending on your platform. See [const volatile](http://stackoverflow.com/questions/4592762/difference-between-const-const-volatile). Making it *static* should make it clear to the compiler that it can be optimized. – artless noise Jun 20 '13 at 19:27
  • 2
    Why don't you publish the ARM assembly dump as well? Where is the output from RVDS? This doesn't seem to relate to ARM except for the word RVDS. – auselen Jun 20 '13 at 21:45
  • @artlessnoise why would marking it as static make it clear to the compiler that it can be optimized? – 0x90 Jun 22 '13 at 07:44
  • 2
    Another module can take the address of the `const` and change it. In C++, there is no global `const`, but in *C* there is. Ie, the compiler doesn't have to allocate space for a variable. For instance, a `const` hardware register may mean it is read-only to software; the behaviour makes more sense with `extern const`. Some people may conclude by reading the standard that a compiler should treat these the same. As the compiler needs to optimize for all these cases, the information `(*(&a))` may not make it through to the optimizer phase, especially as you gave the variable global scope. – artless noise Jun 22 '13 at 15:58
  • @artlessnoise can you add your thoughts as an answer ? thanks – 0x90 Jun 22 '13 at 16:38
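
The `static` suggestion from the comments can be tried with a small translation unit like the one below. This is only a sketch for experimenting: the names `a_internal`, `a_external`, `internal_is_positive` and `external_is_positive` are made up for illustration and do not appear in the question. Both functions must return the same result; whether a given compiler folds either comparison away is an implementation detail that has to be checked in its own assembly output.

```c
/* Internal linkage: no other translation unit can refer to this object by name. */
static const int a_internal = 0x93191;

/* External linkage: other translation units can declare and reference it. */
const int a_external = 0x93191;

/* Reads the internal-linkage constant through its address. */
int internal_is_positive(void)
{
    return *(&a_internal) > 0;
}

/* Reads the external-linkage constant through its address. */
int external_is_positive(void)
{
    return *(&a_external) > 0;
}
```

Compiling this with optimization enabled and comparing the code generated for the two functions shows whether linkage makes any difference for the compiler at hand.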

2 Answers

11

The amount of analysis a compiler will do to answer "can I figure out what the actual value is?" is not unbounded. If you write a sufficiently complex statement, the compiler will simply say "I don't know what the value is, I'll generate code to compute it".

It is perfectly possible for a compiler to figure out that the value is not going to change. But it's also possible that some compilers "give up" in the process - it may also depend on where in the compilation chain this analysis is done.

This is probably a fairly typical example of the "as-if" rule - the compiler is allowed to perform any optimisation that generates the result "as-if" the code was executed as written.

Having said all that, this should be fairly trivial (and as per the comments, the compiler should consider *(&a) the same as a), so it seems strange that it then doesn't get rid of the comparison.
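
To make the equivalence concrete, here is a minimal sketch using the constant from the question. The function names `direct_check` and `indirect_check` are hypothetical. Both functions have the same observable behaviour, so under the "as-if" rule a compiler may reduce either one to a constant return, but it is not obliged to.

```c
const int a = 0x93191;

/* Uses the constant directly. */
int direct_check(void)
{
    return a > 0 ? 1 : 0;
}

/* Takes the address and immediately dereferences it; same meaning as above. */
int indirect_check(void)
{
    return *(&a) > 0 ? 1 : 0;
}
```

If the generated code for the two functions differs, that is a difference in optimization effort, not in the result the program computes.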

Mats Petersson
  • 6
    Undoubtedly this is a failure of the compiler to recognize an optimization opportunity, but it is not as simple as crossing some threshold of complexity in the compiler. `*(&a)` is not only a very simple expression, but it is explicitly called out in a footnote in the C standard as equal to `a` (Note 83 in the 1999 standard, 84 in 1999 TC2 draft n1124, 102 in 2011). We know this compiler optimizes `a > 0` given the visible value of `a`. So the fact that it fails to optimize `*(&a) > 0` suggests it has missed a clear and explicitly stated aspect of the C semantics. – Eric Postpischil Jun 20 '13 at 17:23
  • 1
    I am down-voting this post because something else is going on. No reasonable compiler would quit. – Mikhail Jun 20 '13 at 17:44
  • @EricPostpischil: In C++ it is not exactly the same. Integral constants are slightly different in C and C++. In the case of C++, `a` does not constitute *odr-use* of the constant `a`, while `*&a` is *odr-use*, for example. While in the example code this does not matter too much (the symbol **is** defined), it could matter for static members. – David Rodríguez - dribeas Jun 20 '13 at 19:18
  • @Mikhail A *reasonable* compiler may give up as the variable is not **static**. – artless noise Jun 20 '13 at 19:28
  • @EricPostpischil What the standard says is equivalent doesn't apply to the optimizer. Your statement *We know this compiler optimizes..*, doesn't mean anything when put together with the standard. This seems like a [non sequitur](http://en.wikipedia.org/wiki/Non_sequitur_%28logic%29), the context of the standards seems to be in relation to the `*` operator and doesn't imply anything about code generation. – artless noise Jun 20 '13 at 19:49
  • @artlessnoise: The job of the optimizer is to take advantage of anything it can to improve program execution (or space, depending on the optimization goal). If there is a simple, well-documented property and the optimizer fails to take advantage of it, that is a shortcoming of the optimizer. In other words, the fact that `*&a` and `a` are equivalent does not compel the optimizer to reduce `*&a` to `a`, but that fact combined with the specification that the optimizer should optimize implies it should. In fact, this simple reduction should be performed early in the semantic processing. – Eric Postpischil Jun 20 '13 at 19:55
  • @EricPostpischil Sure. Your previous comment seems to say that it is mandated by the standard. I think we agree, mainly. This applies well to compilers for computing environments. It might not be true that a `const *` will never change. A compiler need not even allocate `static const a = 1;`, but it has to have an address if we have `const a;` and take pointers to this. Ie, there is a real physical address. Depending on the tools, someone *could* alter this physical address by having it `const` in one module and not in another; that may be non-standard, but supported. – artless noise Jun 20 '13 at 20:32
  • @artlessnoise: What does “non-standard, but supported” mean? The C standard does not support defining an object to be const in one place but declaring it non-const in another. Do you mean this particular C implementation supports it? – Eric Postpischil Jun 20 '13 at 20:38
  • @EricPostpischil I mean not all code is 'C'. That is what you open the compiler up to when you don't use `static`. – artless noise Jun 20 '13 at 21:34
4

Optimizations are implementation details of the compilers. It takes time and effort to implement them, and compiler writers usually focus on the common uses of the language (i.e. the return on investment of optimizing code that is highly infrequent is close to nothing).

That being said, there is an important difference between the two pieces of code: in the first case a is not odr-used, only used as an rvalue, which means that it can be processed as a compile-time constant. That is, when a is used directly (no address-of, no references bound to it) compilers immediately substitute the value in. The value must be known by the compiler without accessing the variable, since it could be used in contexts where constant expressions are required (i.e. defining the size of an array).

In the second case a is odr-used, the address is taken and the value at that location is read. The compiler must produce code that does those steps before passing the result to the optimizer. The optimizer in turn can detect that it is a constant and replace the whole operation with the value, but this is a bit more involved than the previous case where the compiler itself filled the value in.
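
The steps described above can be spelled out in C as a rough sketch. The function names are invented for illustration. The last function is an extra assumption on my part, tying in the `volatile` experiment from the question: reading through a volatile-qualified pointer typically makes compilers such as GCC emit the load and the comparison instead of folding them.

```c
const int a = 3333;

/* Value used directly: the front end can substitute 3333 right away. */
int uses_value(void)
{
    return a > 333;
}

/* Address taken, then the value at that location is read.
   Folding this to a constant is left to the optimizer. */
int uses_address(void)
{
    const int *p = &a;
    return *p > 333;
}

/* Same read through a volatile-qualified lvalue: compilers typically
   keep the memory access rather than replacing it with the constant. */
int uses_volatile_access(void)
{
    const volatile int *p = &a;
    return *p > 333;
}
```

All three return the same value; only the amount of work left for the optimizer differs.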

David Rodríguez - dribeas