46

I am curious about the liberties that a compiler has when optimizing. Let's limit this question to GCC and C/C++ (any version, any flavour of standard):

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

The example I have in mind is printing different bits of text in various constructors in C++ and getting a difference depending on whether copies are elided (though I've not been able to make such a thing work).

Counting clock cycles is not permitted. If you have an example for a non-GCC compiler, I'd be curious, too, but I can't check it. Bonus points for an example in C. :-)

Edit: The example code should be standard compliant and not contain undefined behaviour from the outset.

Edit 2: Got some great answers already! Let me up the stakes a bit: The code must constitute a well-formed program and be standards-compliant, and it must compile to correct, deterministic programs in every optimization level. (That excludes things like race-conditions in ill-formed multithreaded code.) Also I appreciate that floating point rounding may be affected, but let's discount that.

I just hit 800 reputation, so I think I shall blow 50 reputation as bounty on the first complete example to conform to (the spirit) of those conditions; 25 if it involves abusing strict aliasing. (Subject to someone showing me how to send bounty to someone else.)

ks1322
  • 33,961
  • 14
  • 109
  • 164
Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084
  • 4
    Realize that, in the case of copy elision, the compiler is *specifically allowed* to change the observable behavior of the program. – Robᵩ Jun 15 '11 at 21:14
  • 1
    @Rob: That's fine, I'd be very happy to see just a working example of that. I understand that all constructors are expected to yield semantically identical objects, so by putting print routines in them I'm purposefully introducing a discrimination that the compiler doesn't have to be concerned by. That would be an acceptable example, though! – Kerrek SB Jun 15 '11 at 21:17
  • 1
    "I've not been able to make such a thing work" - maybe in GCC copy ctor elision is enabled even with no optimization? – Steve Jessop Jun 15 '11 at 21:18
  • http://ideone.com/uvcoV - notice how Func1 doesn't invoke the copy constructor, but Func2 does. – Robᵩ Jun 15 '11 at 21:31
  • @Rob: Yes, very nice, but I can't _prevent_ the optimization for Func1 with compiler settings, can I? (Good to know that this sort of code is nicely optimizes, though!) – Kerrek SB Jun 15 '11 at 21:35
  • 2
    @Kerrek: aha! `man` to the rescue, yes you can, with `-fno-elide-constructors`. Assuming that you permit such a fine-grained option as a "different optimization level", I think that fits your requirements. – Steve Jessop Jun 15 '11 at 21:43
  • @Steve: Ooh, will try that with Rob's code ... yes, that actually works! Doesn't seem to be part of any numeric optimization level; probably there's no case where this would be desirable. – Kerrek SB Jun 15 '11 at 21:49
  • Unfortunately what is allowed by the standard and what is reality often differs. E.g. on x86 based architectures it is not a good idea to use std::set and similar constructs, since especially when optimization is enabled, those can behave errenously (i.e. crash). – PlasmaHH Aug 17 '11 at 14:48
  • PlasmaHH: How can a `set` crash? I mean, I know that you cannot compare doubles by binary identity when you make computations, but when you just use them as a write-once-read-many key, what's the problem? – Kerrek SB Aug 17 '11 at 16:48
  • 1
    @Kerrek: at a guess, you could have problem if the set tries to compare a value that has somehow survived in an 80-bit FPU register all the way to its use, with "the same value" that has been stored in memory along the way and hence got truncated. This could lead to an impossibility that the `std::set` implementation didn't anticipate. I'm not sure this is standard-compliant, mind, but PlasmaHH is talking about reality differing from the standard, i.e. a bug, so anything goes really :-) – Steve Jessop Aug 17 '11 at 18:18
  • @Steve: that's definitely an interesting scenario! I wonder if that's been thought about... – Kerrek SB Aug 17 '11 at 18:32
  • Also note that `<` is not necessarily a strict weak order on `double`, because `0 < NaN` and `NaN < 1` are both false, but `1 < 0` is also false. Or to put it another way, `!(X < Y) && !(Y < X)` is not an equivalence relation because it's not transitive for NaNs. So `set` can crash if NaNs get involved even if it's bug-free. – Steve Jessop Aug 17 '11 at 18:32
  • @Steve: Yeah, good point - you definitely cannot have NaN as a key. But as I said, we'd definitely never want to do arithmetic on the key values. At the very best, we consider them purely for storage. (So don't go storing NaNs!) – Kerrek SB Aug 17 '11 at 18:37
  • @SteveJessop Do not use NaN as key is a reasonable constraint. Do you use numbers that cannot be represented exactly in the fp type is not! – curiousguy Oct 23 '17 at 05:11

14 Answers14

19

The portion of the C++ standard that applies is §1.9 "Program execution". It reads, in part:

conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. ...

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. ...

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions. ...

So, yes, code may behave differently at different optimization levels, but (assuming that all levels produce a conforming compiler), but they cannot behave observably differently.

EDIT: Allow me to correct my conclusion: Yes, code may behave differently at different optimization levels as long as each behavior is observably identical to one of the behaviors of the standard's abstract machine.

Community
  • 1
  • 1
Robᵩ
  • 163,533
  • 20
  • 239
  • 308
  • 1
    How does this square with the invocation of different constructors? If I write `T x /* default */; x = T(2, 3);`, would it be illegal to optimize this to `T x(2, 3);`? – Kerrek SB Jun 15 '11 at 21:32
  • 8
    Note "one of the possible execution sequences". It doesn't have to be the *same* one at each optimization level. So for example the compiler is allowed to evaluate function arguments in one order at -O0 and a different order at -O4. A program whose output differs according to which order actually occurs is valid, although it isn't what the C standard calls "a strictly conforming program". Then again `printf("%d\n", CHAR_BIT)` isn't a strictly conforming program either since the output is implementation-defined. – Steve Jessop Jun 15 '11 at 21:35
  • 2
    @Kerrek: that optimization is illegal for example if the assignment operator `T::operator=(const T &)` has observable side-effects. As you know there's a special rule to allow copy ctor elision even when the copy ctor has side-effects. There's no such rule for copy assignment. Same applies to the default construction, its observable behavior can't be elided either. But as long as nothing affects observable behavior, the compiler can leave out anything it wants (and if `x` is unused, construct no object at all). – Steve Jessop Jun 15 '11 at 21:36
  • @Steve: I didn't actually know about the exception for the copyctor. Does that mean that the indeterminacy in how often something gets copied (or default-constructed) is the _only_ thing that may affect visible behaviour according to the standard? – Kerrek SB Jun 15 '11 at 21:46
  • @Kerrek: you did know, even if you didn't know that it was an explicit special case, since you mentioned copy ctor elision in the question :-). It's not the only thing: anything that is unspecified or implementation-defined can change. The language used is 12.8/15 is "the implementation is allowed to do this", but in effect what it's saying is the same as if it said "it is unspecified whether the implementation does this" – Steve Jessop Jun 15 '11 at 21:51
  • Hmm... there've been many good answers, but I have to go with one of them. Let me accept this one, I hope that won't diminish the value of all the other ones! (And if you tell me how to forward reputation I will send you some for further examples!) – Kerrek SB Jun 15 '11 at 23:08
  • @Rob: I never gave you that reputation. Have it now, with interest. (In 24h, that is.) – Kerrek SB Aug 16 '11 at 22:18
  • @SteveJessop: My first thought on seeing the question was `printf("Hey\n")+printf("There\n");`, though I'm curious about how the standard would regard the conformance of `printf("%d",fgetc(f1)+fgetc(f2));` If f1 and f2 happen to be pipe streams, one could construct scenarios where the sequence in which they are read would be observable, but in most practical circumstances a program which read f1 first would be equivalent to one which read f2 first. – supercat Dec 12 '13 at 21:53
  • @supercat: In C terminology it is conforming but not strictly conforming. A strictly conforming program is one whose observable behavior is necessarily the same in all conforming implementations. Operations with unspecified order and side effects is just one of the many, many things you can do in C that conforms but doesn't strictly conform. `printf("%d\n", CHAR_BIT)` is another. `CHAR_BIT` would never vary with optimization level, of course. The examples with `+` might, but it's not obvious why they would given the symmetry involved. – Steve Jessop Dec 12 '13 at 22:44
  • @Steve: If different functions had been called (instead of printf twice or fgetc twice), it's possible that a one sequence would require the return value of the first function to be saved while calling the other, but flipping them could avoid that step. Otherwise, how would "strictly conforming" be defined with things like volatile variables? Could a program that performs `n=v1+v2;` [v1 and v2 volatile] be strictly conforming, depending upon how v1 and v2 are used elsewhere? Also, does "conforming" imply that a program must work "correctly" in all implementations? What defines "correct"? – supercat Dec 12 '13 at 22:56
  • @supercat: (1) order of access to volatile variables is observable, so sure `v1+v2` is not strictly conforming if `v1` and `v2` are volatile. (2) nothing defines correct, a conforming program is one that's syntactically correct and satisfies semantic constraints (hence has no undefined behavior on any conforming implementation). But since its observable behavior may depend on things that are implementation-defined or unspecified, it doesn't have a single definitive "correct" behavior. – Steve Jessop Dec 13 '13 at 08:15
  • @SteveJessop: When you say satisfies semantic constraints, do you mean "does whatever it's supposed to do"? For example, if the stated purpose of a program is to output every distinct line in the source file once each, in arbitrary order, would a program which might on some implementations output a line twice not be considered conforming even if it never hit any UB, while a program whose output sequence on different implementations could legally vary but which would always contain each distinct line exactly once would be conforming? – supercat Dec 13 '13 at 16:55
  • @supercat: no, I mean the same thing that the standard means by "semantic constraints". For example "it is UB to dereference a null pointer" is a semantic constraint. – Steve Jessop Dec 13 '13 at 17:23
  • @SteveJessop: Hmm... it would seem like it would be useful to have a term which specifies that "A program which is defined as performing some function will be considered ____ if, when fed to any standards-conforming compiler whose implementation limits are not too small, it will perform the indicated function." In some contexts a guarantee that a program is guaranteed not to invoke UB may be highly desirable, but a program whose *only* behavioral guarantee is that it won't invoke UB isn't very useful. Replacing any program with `int main(void) { return 0;}` would satisfy that. – supercat Dec 13 '13 at 17:33
  • I don't think it's any of the standard's business what claims you make about what your code does, so it doesn't define anything like that. I think the term I'd use for that concept, to replace ______, is "correctly documented" :-) Looking from the other direction (the documentation is correct by definition because it's the functional requirements, the question is whether or not the code is right), you'd say "ready to ship". Or "portable". – Steve Jessop Dec 13 '13 at 18:15
16

Floating point calculations are a ripe source for differences. Depending on how the individual operations are ordered, you can get more/less rounding errors.

Less than safe multi-threaded code can also have different results depending on how memory accesses are optimized, but that's essentially a bug in your code anyhow.

And as you mentioned, side effects in copy constructors can vanish when optimization levels change.

Dennis Zickefoose
  • 10,791
  • 3
  • 29
  • 38
  • I'd like to exclude multithreading stuff like that because that's essentially just undefined behaviour -- you're only allowed a multithreading example if it synchronizes properly. I hadn't thought about floating point inaccuracies, let's discount those, too (though that's a fair point). – Kerrek SB Jun 15 '11 at 21:20
  • Compilers aren't allowed to reorder floating point operations unless you use "unsafe" optimization options like `-ffast-math`. (They are allowed to "contract" `a*b + c` into `fma(a,b,c)` though, and do so in practice when targeting a CPU that has FMA instructions.) – Peter Cordes Jun 24 '22 at 20:32
  • Other than that, floating-point rounding differences are usually only a thing on implementations like 32-bit x86, using the x87 FP, where temporaries in register have more precision than `double`. i.e. `FLT_EVAL_METHOD == 2`, except even across statements (not just within one expression) unless you use `gcc -ffloat-store` or `-O0`. See also https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/. Mostly that's a thing of the past, though, with SSE2 for scalar math on x86, like most other ISAs, having deterministic FP with `FLT_EVAL_METHOD == 0`. – Peter Cordes Jun 24 '22 at 20:33
14

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

Only if you trigger a compiler's bug.

EDIT

This example behaves differently on gcc 4.5.2:

void foo(int i) {
  foo(i+1);
}

main() {
  foo(0);
}

Compiled with -O0 creates a program crashing with a segmentation fault.
Compiled with -O2 creates a program entering an endless loop.

BЈовић
  • 62,405
  • 41
  • 173
  • 273
  • Especially in multi-threaded code, very subtle code bugs (that is, code where the result is not well-defined) will show different results with different optimization levels. It may not look like a bug not to mark a variable as volatile, and it may only cause problems at a specific optimization level-- but this is not a compiler bug. – antlersoft Jun 15 '11 at 21:13
  • OK, simple example: If I write `T foo[] = { T(1), T(2), T(3) };`, most compilers will construct the objects in place, but the standard perfectly allows temporary + copy. So if I put a print routine in the copy constructor, I already get different behaviour depending on how this is implemented, without any bugs at all, non? Now I just want something like that dependent on the optimization level. – Kerrek SB Jun 15 '11 at 21:14
  • 10
    Your example is not a compiler bug though, it works fine with optimization because gcc does tail call optimization, it crashes without optimization because you run out of stack space. – nos Jun 15 '11 at 21:19
  • Haha, that's a cool example! Not quite what I had in mind (I'd really like _two_ functioning programs), but that's great! – Kerrek SB Jun 15 '11 at 21:23
  • 3
    @nos: At no point does the OP state that he is looking for compiler bugs. – Dennis Zickefoose Jun 15 '11 at 21:25
  • 6
    The above is definitely not a compiler bug. It sounds like the code is working perfectly in both cases. – Martin York Jun 15 '11 at 22:34
  • Fore more info about this: http://stackoverflow.com/questions/5493688/stack-overflow-error-or-exception/5493734#5493734 – Matteo Italia Jun 15 '11 at 23:24
  • 3
    Not only if you trigger a bug. An example: `int x = printf("hello") + printf("world");` It could print `helloworld` or `worldhello`, and which it does *might* depend on whether optimization is turned on. I.e. optimization could change things where the behavior is *unspecified*. – R.. GitHub STOP HELPING ICE Jun 16 '11 at 02:10
  • 2
    I liked the example as it actually shows a particular optimization *tail recursion* in place. In the first case the optimization is not applied, and it generates a stackoverflow by recursively calling itself, while in the second example the optimizer has converted the recursion into a loop... again **great** example. – David Rodríguez - dribeas Aug 17 '11 at 18:15
11

OK, my flagrant play for the bounty, by providing a concrete example. I'll put together the bits from other people's answers and my comments.

For the purpose of different behaviour at different optimizations levels, "optimization level A" shall denote gcc -O0 (I'm using version 4.3.4, but it doesn't matter much, I think any even vaguely recent version will show the difference I'm after), and "optimization level B" shall denote gcc -O0 -fno-elide-constructors.

Code is simple:

#include <iostream>

struct Foo {
    ~Foo() { std::cout << "~Foo\n"; }
};

int main() {
    Foo f = Foo();
}

Output at optimization level A:

~Foo

Output at optimization level B:

~Foo
~Foo

The code is totally legal, but the output is implementation-dependent because of copy constructor elision, and in particular it's sensitive to gcc's optimization flag that disables copy ctor elision.

Note that generally speaking, "optimization" refers to compiler transformations that can alter behavior that is undefined, unspecified or implementation-defined, but not behavior that is defined by the standard. So any example that satisfies your criteria necessarily is a program whose output is either unspecified or implementation-defined. In this case it's unspecified by the standard whether copy ctors are elided, I just happen to be lucky that GCC reliably elides them pretty much whenever allowed, but has an option to disable that.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • Nooo, wait, I really want to give the bounty to Rob, I just had to wait 24hrs with it, and I feel bad for not having done it back when I first asked the question... is that OK? I can give you bounty on some other question if you like! Copy-constructor elision is definitely one of the most interesting aspects of optimization, and that GCC option is a cool find. I would happily split the bounty if that's an option... – Kerrek SB Aug 17 '11 at 18:48
  • Hmm, you've certainly made most of the constructive comments to this question, so you definitely deserve the moral bounty. Quite a dilemma, I wish there were a way to pay tribute rep points that are not limited to one user per question. – Kerrek SB Aug 17 '11 at 18:50
  • @Kerrek: no worries, I'm sure you've upvoted plenty of my answers elsewhere and given me rep that way, we both hang around the C++ questions enough. Sorry, I didn't realise that you created the bounty *because* you had in mind to award it to Rob's answer. I just spotted the question under the "featured" tab and recognized it, so I knew I could give a bounty-hunting answer without effort. I totally don't "deserve" a bounty just for bothering to type out the code and run it to confirm the result I expected, on most questions I do that for normal upvotes :-) – Steve Jessop Aug 17 '11 at 18:53
  • Cool, and I'm sure we'll come up with another opportunity before long :-) (I think I still have one unanswered question out, for that matter.) – Kerrek SB Aug 17 '11 at 18:57
  • You do have an unanswered question, but I don't know the answer so that's no use. This is why I don't win bounties, I do enough hard work on my job without going beyond the low-hanging fruit and into real research and testing to answer SO questions ;-) – Steve Jessop Aug 17 '11 at 19:06
  • Hehe, who doesn't? :-) Yeah, that question is a bit of a dud, I don't really expect a working solution from that. ("Detours" is a fine start, but I'd need to know further, disparate Windows internals to achieve my goal.) For that matter, I don't think any of the bountied questions on SO are pleasant, generally useful language questions, they're always about some ugly, overly specific three-library-plus-widget problem... – Kerrek SB Aug 17 '11 at 19:12
  • Ah, I can award multiple bounties :-) – Kerrek SB Aug 18 '11 at 00:28
  • I would argue that `-fno-elide-constructors` is not an optimization option, it's a debug option, or at a push a pessimization option! That option is only necessary for people who don't realize the standard allows copy elision (and so their program's "correctness" relies on side effects which may not happen) or who are trying to understand copy elision (and so want to experiment to see where it does/doesn't happen). – Jonathan Wakely Mar 05 '18 at 22:30
8

For C, almost all operations are strictly defined in the abstract machine and optimizations are only allowed if the observable result is exactly that of that abstract machine. Exceptions of that rule that come to mind:

  • undefined behavior don't has to be consistent between different compiler runs or executions of the faulty code
  • floating point operations may cause different rounding
  • arguments to function calls can be evaluated in any order
  • expressions with volatile qualified type may or may not be evaluated just for their side effects
  • identical const qualified compound literals may or may be not folded into one static memory location
Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
5

Anything that is Undefined Behavior according to the standard can change its behavior depending on optimization level (or moon-phase, for that matter).

ninjalj
  • 42,493
  • 9
  • 106
  • 148
  • Yes, true. Do let us exclude anything undefined. I would like a well-defined, standard-conforming program which compiles correctly in all optimization levels and gives correct, deterministic programs but which behave differently. – Kerrek SB Jun 15 '11 at 21:24
  • 2
    @Kerrek SB: How about unspecified behavior? Given a function call like `f(a(x), b(y))`, we know that `a(x)` is evaluated either before or after `b(y)`, but not concurrently. – David Thornley Jun 15 '11 at 21:35
  • @David: Ah, evaluation order, I see. Hm... is that affectable by optimisation? I'd say let's discount it because it's already non-deterministic within the standard, but if you have a working example, let's see it. – Kerrek SB Jun 15 '11 at 22:05
  • @Kerrek: the standard doesn't say anything about optimization levels or what they mean. The *only* room for code to behave differently at different optimization levels, is for the implementation to make different decisions about things that the standard doesn't wholly pin down. So any example, including the copy elision one, is bound to be an example of something that's either implementation-defined, unspecified, or undefined behavior. – Steve Jessop Jun 16 '11 at 01:40
3

Since copy constructor calls can be optimized away, even if they have side effects, having copy constructors with side-effects will cause unoptimized and optimized code to behave differently.

Kleist
  • 7,785
  • 1
  • 26
  • 30
  • Yeah, that was my idea to begin with, it's just that I never managed to actually get different behaviour by varying the optimization level. Steve Jessop discovered `-fno-elide-constructors` though, which does the trick. – Kerrek SB Jun 15 '11 at 22:04
2

The -fstrict-aliasing option can easily cause changes in behavior if you have two pointers to the same block of memory. This is supposed to be invalid but is actually quite common.

pts
  • 80,836
  • 20
  • 110
  • 183
Mark Ransom
  • 299,747
  • 42
  • 398
  • 622
  • Yeah, I figured that... did that ever happen to you in reality, though? – Kerrek SB Jun 15 '11 at 21:19
  • @Kerrek: it shouldn't be too hard to come up with an example, I think I've managed it before. Start with `float f = 0; int &a = *(int*)(&f); int b = a; f = 1; int c = a; std::cout << a << " " << b << " " << c << "\n";`, then play around with it until you can get the compiler to order the operations differently at different optimization levels, and/or entirely omit one or both assignments to `f` at some levels but not others. The optimizer is entitled to assume that `f = 1;` doesn't alter the referand of `a`, so you just need to catch it doing so. – Steve Jessop Jun 15 '11 at 21:22
  • @Kerrek, sorry I don't remember a specific instance. And if I did run into it, it probably wasn't with gcc. There's gotta be a reason it sticks with me though. – Mark Ransom Jun 15 '11 at 21:40
  • @Steve: Interesting. No luck with that yet, but I'll try and tweak the values. – Kerrek SB Jun 15 '11 at 21:44
2

This C program invokes undefined behavior, but does display different results in different optimization levels:

#include <stdio.h>
/*
$ for i in 0 1 2 3 4 
    do echo -n "$i: " && gcc -O$i x.c && ./a.out 
  done
0: 5
1: 5
2: 5
3: -1
4: -1
*/

void f(int a) {
  int b;
  printf("%d\n", (int)(&a-&b));
}
int main() {
 f(0);
 return 0;
}
Robᵩ
  • 163,533
  • 20
  • 239
  • 308
  • 1
    If you invoke undefined behaviour, I would say the natural thing is to expect different outcomes. Note that the difference in the output does not have to be even due to the optimization level. By invoking undefined behaviour, each run of the program is allowed to produce a different output. – Schedler Jun 17 '11 at 16:44
  • Granted, and worth mentioning. But so far no one has produced an example that varies according to optimization level and *doesn't* require UB. – Robᵩ Jun 17 '11 at 16:47
  • So the idea here is to print addresses of variables? Interesting, that's certainly something that can vary... hadn't thought of that. Cheers. – Kerrek SB Jun 17 '11 at 17:32
  • @Kerrek - No, the idea is to print the difference of addresses of varables. Specifically, it measures whether the function had a frame pointer, or perhaps was inlined. – Robᵩ Jun 17 '11 at 19:23
  • 2
    +1 for this one. Although formally it's undefined behavior and not allowed under the stricter version of the question, in practice most implementations do pretty much define the results of pointer arithmetic on objects that aren't both subobjects of some common object. So that's almost, but not quite, implementation-defined, and then the actual relative position of `a` and `b` is unspecified. Anyway, it's a fairly ingenious answer that on pretty much any implementation isn't UB for that implementation, even though it's UB for the standard. – Steve Jessop Aug 17 '11 at 19:02
1

gcc defines __OPTIMIZE__ macro when non-zero optimization level is used. You can use it like below:

#ifdef __OPTIMIZE__
printf("Code compiled with -O1 or higher\n");
#else
printf("Code compiled with -O0\n");
#endif
Daniel Frużyński
  • 2,091
  • 19
  • 28
0

same source code like source code

before enable -finline-small-functions and after enable -finline-small-functions

Before enable -finline-small-functions

After enable -finline-small-functions

-finline-small-functions can be enabled in -O2/-O3

Shouguo
  • 31
  • 2
  • only focus on function main and _Z1Av _Z1Bv _Z1Cv function – Shouguo Nov 10 '18 at 05:15
  • 1
    Firstly, the code should be included as text. (Because we can't copy code from a picture to compile it, and also because some of us rely on screen readers.) – HolyBlackCat Nov 10 '18 at 05:28
  • Can you add some explanations to you answer? I assume the second picture shows two call graphs? Is the observable behavior of the code with and without `-finline-small-functions` different? – HolyBlackCat Nov 10 '18 at 05:30
0

Two different C programs:

foo6.c

void p2(void);

int main() {
    p2();
    return 0;
}

bar6.c

#include <stdio.h>

char main;

void p2() {
    printf("0x%x\n", main);
}

When both modules are compiled into one excecutable with optimization levels one and zero, they print out two different values. 0x48 for -O1 and 0x55 for -O0

Screenshot of terminal

Here is an example of it working in my environment

  • 2
    Is it undefined behavior? I don't think undefined behavior really count. If it isn't, then nice complementary answer. – Guillaume Racicot Mar 20 '19 at 18:43
  • @GuillaumeRacicot I think it is defined behavior, check out the screenshot I just put in the post. – Alec Deshaies Mar 21 '19 at 20:13
  • I don't think this is exactly how it works. Undefined behavior is specified in the standard. Your implementation could choose to act like you expect. Or it could act differently depending on the optimization level. Or it could crash. I don't know if your example is defined or undefined behavior. Simply running it is not enough. Only the rule can say at this point. – Guillaume Racicot Mar 21 '19 at 20:43
0

a.c:

char *f1(void) { return "hello"; }

b.c:

#include <stdio.h>

char *f1(void);

int main()
{
    if (f1() == "hello") printf("yes\n");
        else printf("no\n");
}

Output depends on whether merge string constants optimization is enabled or disabled:

$ gcc a.c b.c -o a -fno-merge-constants; ./a
no
$ gcc a.c b.c -o a -fmerge-constants; ./a
yes

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
-1

Got some interesting example in my OS course today. We analized some software mutex that could be damaged on optimization because the compiler does not know about the parallel execution.

The compiler can reorder statements that do not operate on dependent data. As I already statet in parallelized code this dependencie is hidden for the compiler so it could break. The example I gave would lead to some hard times in debugging as the threadsafety is broken and your code behaves unpredictable because of OS-scheduling issues and concurrent access errors.

  • Was it a bug or was it expected behaviour in light of the optimization steps? – Kerrek SB Jun 15 '11 at 21:11
  • 1
    It is definitely a bug as the mutex does not work any longer. But it is not the compilers fault though as it is not designed to recognize parallel access. Its the programmers responsibility to mark this for the compiler with the keyword volatile. Seen as such I would not call it bug anymore. – Nobody moving away from SE Jun 15 '11 at 21:15
  • volatile isn't enough. C++0x and C1x introduce atomic types because of that. – ninjalj Jun 15 '11 at 21:44