Possible optimization for compilers or defined behaviour

Question

If a function does not take any references or pointers as parameters, its return value is unused, and it makes no calls that observably leave the system (I/O calls, changing the system time, etc.), is it guaranteed to modify only the class in which it is defined (or nothing at all)?

The only exception to this rule that I can think of is something like the following:

void a(int b, int c) {
    *((int*)b) = c;
}

int main() {
    int d = 1;
    a((int)(&d), d + 1);
    return 0;
}

Is that guaranteed to be defined? I know that int* and int do not have to be the same size, but if they are defined to be the same size, does this have to work, or is it still undefined behavior?

The goal is to see if a function can be legally optimized out (i.e. if you can prove that it has no side effects, it can be removed).

See: [wikipedia's dead code elimination](http://en.wikipedia.org/wiki/Dead_code_elimination) and [SO's dead code](http://stackoverflow.com/questions/4813947/how-can-i-know-which-parts-in-the-code-are-never-used). — artless noise, Apr 10 '13 at 00:03
@artlessnoise this is more about something like http://stackoverflow.com/questions/15825188/removing-useless-lines-from-c-file . Also, this is not dead code, this is live code that does nothing. — soandos, Apr 10 '13 at 00:08
No, I think (correct me if I am wrong) you can't optimize it out. Think about b and c being instances of classes having shared pointers, which could be affected. Even think about the call of destructors that take place when leaving a scope. — rralf, Apr 10 '13 at 00:11
@soandos The wikipedia article doesn't explain this fully, but there are many compiler optimizations which detect *live code that does nothing*. That is dead code. See the reference sections at wikipedia and [google](https://www.google.ca/search?q=ssa+dead+code+elimination). If it isn't covered by compiler research, then I think your question borders on esoteric. — artless noise, Apr 10 '13 at 00:22
@soandos As the compiler doesn't know exactly what a class does, I think it will only optimize out things like "int i;i=5;i++" (-O0 does not optimize this out, -O1 does). The compiler knows about int and its behaviour, but it doesn't know about more complex "datatypes" like classes. I don't know whether or where this is defined exactly, but I think it strongly depends on the compiler you are using and its optimization strategies. I suppose that the behaviour is not defined by the language standard, but by the optimization strategy of the compiler. — rralf, Apr 10 '13 at 00:33
There are other cases where your function will have system effects: access to globals/statics, access to registers, writing to a predefined memory address. I'm sure there are more — SomeWittyUsername, Apr 10 '13 at 01:21

score 1 · Accepted Answer · answered Apr 10 '13 at 04:10

The standard guarantees that using reinterpret_cast to convert a pointer to a suitable integral type (one large enough to hold all pointer values) and back to the original pointer type yields the same pointer value. So yes, this is guaranteed:

int *p = new int(5);
intptr_t i = reinterpret_cast<intptr_t>(p);
// ...
int *q = reinterpret_cast<int*>(i);

assert(p == q);
*q = 10;
assert(*p == 10);   // p and q point to the same int, so the write through q is visible through p
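
Applied to the function in the question, the same round trip might look like the sketch below. It assumes the implementation provides std::uintptr_t (the standard makes that type optional), and the names store_through and round_trip_demo are made up for illustration. The parameter is widened from int so that the address survives on platforms where int is smaller than a pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical variant of a() from the question: the address travels as
// std::uintptr_t instead of int, so no pointer bits are dropped.
void store_through(std::uintptr_t address, int value) {
    // Converting the integer back to the original pointer type yields the
    // original pointer, so this write lands in the caller's object.
    *reinterpret_cast<int*>(address) = value;
}

// Passes the address of a local through an integer and returns the
// local's value after the call.
int round_trip_demo() {
    int d = 1;
    store_through(reinterpret_cast<std::uintptr_t>(&d), d + 1);
    assert(d == 2);  // the function modified d despite taking no pointer
    return d;
}
```

So a function with only integer parameters can still modify a caller's object, which is why the parameter list alone does not settle the question.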

The compiler is allowed to remove code that has no side effects, but that cannot clearly be determined by inspecting only the function signature. For inline functions, where the compiler has visibility over the code, the compiler stands a chance. For functions defined in a different translation unit things are a bit harder (with link time optimizations it is still feasible if the function is small enough).
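
As a rough illustration of why the signature is not enough, consider the two functions below. They have identical signatures, but only one of them touches state outside itself. The global call_count and all function names are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

int call_count = 0;  // hypothetical global, used only for this illustration

// Same signature as pure_square, but it also writes to a global.
int counted_square(int x) {
    ++call_count;
    return x * x;
}

// Depends only on its argument and writes nothing outside itself.
int pure_square(int x) {
    return x * x;
}

void signature_demo() {
    pure_square(3);     // result unused and no side effects: a removable call
    counted_square(3);  // result unused, but the update of call_count is
                        // read by the caller afterwards, so it has to survive
    assert(call_count == 1);
}
```

A compiler that sees both bodies can tell the calls apart; one that sees only the declarations has to assume the worst for both.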

Note that this is not just limited to functions that take the arguments by value or const-reference. If the compiler sees a function that modifies an argument by reference, but it can prove that the value of the modified object is never read again, it can theoretically remove the call. On the other hand, I would not bet on the compiler doing this in anything but simple cases.
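
A minimal sketch of that situation, with made-up names (bump, answer): the callee writes through a reference, but the modified object is never read afterwards, so under the as-if rule the call could in principle disappear. Whether a given compiler actually does this depends on its optimizer and on the function body being visible.

```cpp
#include <cassert>

// Modifies its argument through a reference.
void bump(int& value) {
    value += 1;
}

// scratch is written by bump() but never read again, so no observable
// behaviour depends on the call.
int answer() {
    int scratch = 0;
    bump(scratch);
    return 42;
}

void dead_store_demo() {
    assert(answer() == 42);  // same result with or without the call to bump()
}
```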

score 0 · Answer 2 · answered Apr 10 '13 at 00:38

I'd say this falls in the area of "well defined undefined behavior"; it will probably work all the time (assuming that sizeof(int*) == sizeof(int)), but it's technically undefined and there's a real chance that some compiler could totally break it in the future. Another example of this is using a union to re-interpret the bits of say a float as an int.
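
For the float-as-int case, one way to avoid relying on the union trick is to copy the bytes with std::memcpy, which has defined behaviour in C++. The sketch below assumes float is 32 bits wide (checked at compile time); the helper name float_bits is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Returns the object representation of f as an unsigned 32-bit integer.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // byte-wise copy, no type punning
    return bits;
}

void float_bits_demo() {
    // Identical values have identical representations here.
    assert(float_bits(1.0f) == float_bits(1.0f));
    // Distinct values differ in at least one bit.
    assert(float_bits(1.0f) != float_bits(2.0f));
}
```

Compilers commonly turn a small fixed-size memcpy like this into a plain register move, so the defined version need not cost more than the union.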

Also, I'd be remiss if I didn't point you in the direction of LLVM's link time optimization. It aims to do exactly what you're talking about at link time. It's awesome and works 'out of the box' on OS X. They've also got a great, simple example of how it works: http://llvm.org/docs/LinkTimeOptimization.html
