Does C++ allow an optimizing compiler to ignore side effects on the for-condition?

Question

While debugging some legacy code I stumbled upon surprising (for me) compiler behavior. Now I'd like to know whether any clause in the C++ spec allows the following optimization, where side effects from a function call on the for-condition are ignored:

void bar() 
{
   extern int upper_bound;
   upper_bound--;
}

void foo()
{
   extern int upper_bound; // from some other translation unit, initially ~ 10
   for (int i = 0; i < upper_bound; ) {
      bar();
   }
}

In the resulting dissambly there is a control path in which upper_bound is preserved in a register and the decrement of upper_bound in bar() never takes effect.

My compiler is Microsoft Visual C++ 11.00.60610.1.

Honestly I don't see much wiggle room in 6.5.3 and 6.5.1 of N3242 but I want to be sure that I'm not missing something obvious.

@Niall everything single threaded. Would have indicated it otherwise. — Daniel Flassig, Sep 26 '14 at 20:09
@ShafikYaghmour I actually doublechecked for that. Both functions (although slightly longer) look innocent... — Daniel Flassig, Sep 26 '14 at 20:18
@DanielFlassig: In that case, can you construct a minimal test-case? — Oliver Charlesworth, Sep 26 '14 at 20:19
@DanielFlassig can you reproduce on [godbolt](http://goo.gl/hWnOgI)? — Shafik Yaghmour, Sep 26 '14 at 20:19
Please see: http://stackoverflow.com/questions/1433204/how-do-i-share-a-variable-between-source-files-in-c-with-extern-but-how — Deduplicator, Sep 26 '14 at 20:20
If there is a matching definition for `upper_bound`, it's a compiler-bug. In the second case you have UB I think, because the compiler need not check whether there is such a definition. — Deduplicator, Sep 26 '14 at 20:26
Compiler and version number please. If your report is accurate, this is a compiler bug. — TonyK, Sep 26 '14 at 20:30
You say there's a control path in the disassembly. It's not a bug unless it affects the visible runtime behavior of a valid program. — Keith Thompson, Sep 26 '14 at 20:32
@DanielFlassig: Please provide some details. Specifically, your compiler + version, a minimal test-case, and the disassembler that you're observing. Without any of these, this question will eventually get closed as "cannot reproduce"! — Oliver Charlesworth, Sep 26 '14 at 20:35
@KeithThompson Well - if it hadn't crashed in flames I would have had a better way to spend my evening ;) — Daniel Flassig, Sep 26 '14 at 20:35
@DanielFlassig: Are you saying there is an error in the visible runtime behavior of a valid program? If so, then (a) you didn't say that in your question, and you should, and (b) you haven't shown us a program that exhibits the problem. If you can provide a [Short, Self Contained, Correct (Compilable), Example](http://sscce.org/), that would be *extremely* helpful. — Keith Thompson, Sep 26 '14 at 20:39
It's Microsoft Visual C++ 11.00.60610.1 compiler. I'm putting together a minimal sample - but I first wanted a second opinion on the spec, just because compiler bugs are a pretty rare thing. — Daniel Flassig, Sep 26 '14 at 20:40
I tested it on GCC 4.8.2 on Linux and it exhibits the reported behaviour. The assembly listing for that exact code shows that the decrementing of upper_bound is optimized away (thus, getting an infinite loop unless upper_bound happens to be 0 when you enter foo()). If you declare a `extern int upper_bound;` in the global scope, the behaviour is "correct" (sets upper_bound to zero and exits the function, which is the optimized observable behaviour of "foo()"). — Mikael Persson, Sep 26 '14 at 21:08
And Clang 3.5 does not optimize away the decrementing of the upper_bound, and is therefore "correct". — Mikael Persson, Sep 26 '14 at 21:10
That seems weird.. Can we see the disassembly (or assembly), please? — drivingon9, Sep 26 '14 at 21:14
@KeithThompson The code given is a short, self-contained and correct example. You can easily compile this and produce the assembly listing for it. What are you talking about? When inspecting such optimization issues, it's all about inspecting the assembly that the optimizing compiler produces. Nobody needs a "runnable" example, which really just complicates things because adding print outs and library includes just adds a lot of noise in the assembly listings, it's not helpful at all. — Mikael Persson, Sep 26 '14 at 21:14
@MikaelPersson: It is not. It has no behavior. A small program that produces no output might be useful for generating an assembly listing so you can *study* the bug (if there is one), but only a self-contained program that produces visible output can definitively demonstrate the existence of a compiler bug. — Keith Thompson, Sep 26 '14 at 21:17
@KeithThompson: Are you saying that a function compiled inside an object file has no observable behavior? Of course it has observable behavior! The observable behavior of the `foo()` function is to set the extern variable `upper_bound` to zero. Observable behavior is not defined as "I can run it and see something pop out of it". The point of view of observability is memory, any change of memory (outside the stack frame of the function) is observable behavior. And that is the observable behavior a compiler is required to preserve across optimizations ("as-if" rule). — Mikael Persson, Sep 26 '14 at 21:35
Simple reproducible example - http://goo.gl/jiT3b2. Plainly hangs when optimization is enabled and compiled with GCC. — T.C., Sep 26 '14 at 21:39
@MikaelPersson: Yes, that's exactly what I'm saying. An optimizing compiler may make whatever transformations it likes, changing what's stored in memory for any non-`volatile` variable, by the "as-if" rule. It's less likely to do so across translation unit boundaries. Nevertheless, *if* there's a compiler bug, then it can be demonstrated via the output of some program. Conversely, if every valid program produces the output required by the standard, then you can't demonstrate that there's a bug. See section 1.9 of the C++ standard, [intro.execution]. — Keith Thompson, Sep 26 '14 at 21:40
@T.C.: Excellent! It would be interesting to see a program that produces different output (and doesn't go into an infinite loop), but I think you've nailed it. With g++ 4.8.2-19ubuntu1, it hangs with `g++ -O2` but not with `g++ -O1`. Daniel: I suggest including T.C.'s demo program in your question. And yes, I believe this is a compiler bug. (The program is also valid C; gcc's C compiler doesn't show the problem. Nor does clang++ 3.5.) — Keith Thompson, Sep 26 '14 at 21:50
@MikaelPersson: In case you didn't see T.C.'s comment, we now have a sample program that demonstrates the compiler bug by its run-time behavior. — Keith Thompson, Sep 26 '14 at 21:51
As far as I can tell, g++ seems to consider the `extern int upper_bound;` in `bar()` and the `extern int upper_bound;` in `foo()` to be different entities. Which is...odd, to say the least. — T.C., Sep 26 '14 at 21:52
@T.C.: Thanks for reproducing. After stripping our code down line by line I basically arrived at what I have posted above and it hangs VC11 with /O2. — Daniel Flassig, Sep 26 '14 at 21:58
Here's a variant that doesn't go into an infinite loop; it prints `ok` or `BUG!` depending on the optimization level. https://gist.github.com/Keith-S-Thompson/be3827ee243d4367f9bf — Keith Thompson, Sep 26 '14 at 22:00
@KeithThompson This program even produces linker errors when compiled with `-O3 -fwhole-program`. Strange. — dyp, Sep 26 '14 at 22:30
`-O2 -fno-tree-loop-im -fno-tree-pre` seem to fix it for an old gcc 4.5.4 — ninjalj, Sep 26 '14 at 22:48
[It seems that clang++ compiles correctly.](http://goo.gl/mwRNfX) — ikh, Sep 26 '14 at 23:58
It's fishy that MSVC and g++ both have the same obscure bug. Do they share part of their codebase? — M.M, Sep 27 '14 at 00:33
@MattMcNabb: I'd have thought any compiler would be liable to behave this way if this obscure scenario hadn't occurred to the developers. — Harry Johnston, Sep 27 '14 at 08:44

score 11 · Answer 1 · edited Jun 20 '20 at 09:12

The standard clearly and unambiguously clarifies that the two declarations of upper_bound refer to the same object.

3.5 Program and linkage [basic.link]

9 Two names that are the same (Clause 3) and that are declared in different scopes shall denote the same variable, function, type, enumerator, template or namespace if

both names have external linkage or else both names have internal linkage and are declared in the same translation unit; and

both names refer to members of the same namespace or to members, not by inheritance, of the same class; and

when both names denote functions, the parameter-type-lists of the functions (8.3.5) are identical; and

when both names denote function templates, the signatures (14.5.6.1) are the same.

Both names have external linkage. Both names refer to a member in the global namespace. Neither name denotes a function or a function template. Therefore, both names refer to the same object. Suggesting that the fact that you've got separate declarations invalidates such basic facts is like saying that int i = 0; int &j = i; j = 1; return i; might return zero, because the compiler might have forgotten what j refers to. Of course that must return 1. This has to work, plain and simple. If it doesn't, you've found a compiler bug.

You're mixing up the issues. The problem is at the compilation stage, and is about the rules for introducing names that are visible to name-lookup, and you are referring to a rule about how variables from declarations in different scopes are required to be *linked* to the same entities. And moreover, your example with `i` and `j` is about memory aliasing rules, which is irrelevant to this problem. — Mikael Persson, Sep 26 '14 at 23:07
@MikaelPersson The whole point of your answer is that the compiler may assume that `upper_bound_1` and `upper_bound_2` may not alias because they have separate declarations. `i` and `j` have separate declarations too. That's a bogus reason. The compiler has to emit code that works even if two separate declarations refer to the same object. If you disagree, show where the standard says the behaviour is undefined. (Don't just explain what a compiler might do, we've already seen that.) You won't find it anywhere. An `int` lvalue is used for modifying an `int` object. That's perfectly valid. — , Sep 27 '14 at 07:14

score -2 · Answer 2 · answered Sep 26 '14 at 22:12

-2

This behavior seems to be correct if you dig a bit into the standard.

First hint is in the note on at section 3.3.1/4, which says:

Local extern declarations (3.5) may introduce a name into the declarative region where the declaration appears and also introduce a (possibly not visible) name into an enclosing namespace;

Which is a little bit vague and seems to imply that compiler is not required to introduce the name upper_bound in the global context when passing through the bar() function, and therefore, when upper_bound appears in the foo() function, there is no connection made between those two extern variables, and therefore, bar() has no side-effect as far as the compiler knows, and thus, the optimization turns into an infinite loop (unless upper_bound is zero to begin with).

But this vague language is not enough, and it is only a cautionary note, not a formal requirement.

Fortunately, there is a precision later on, at section 3.5/7, which goes as follows:

When a block scope declaration of an entity with linkage is not found to refer to some other declaration, then that entity is a member of the innermost enclosing namespace. However such a declaration does not introduce the member name in its namespace scope.

And they even provide an example:

namespace X {
  void p() {
    q();              // error: q not yet declared
    extern void q();  // q is a member of namespace X
  }

  void middle() {
    q();              // error: q not yet declared
  }
}

which is directly applicable to the example you gave.

So, the core of the issue is that the compiler is required not to make the association between the first upper_bound declaration (in bar) and the second one (in foo).

So, let's examine the implication for optimization of the two upper_bound declarations are assumed to be un-connected. The compiler understands the code like this:

void bar() 
{
   extern int upper_bound_1;
   upper_bound_1--;
}

void foo()
{
   extern int upper_bound_2;
   for (int i = 0; i < upper_bound_2; ) {
      bar();
   }
}

Which becomes as follows, due to function inlining of bar:

void foo()
{
   extern int upper_bound_1;
   extern int upper_bound_2;
   while( 0 < upper_bound_2 ) {
      upper_bound_1--;
   }
}

Which is clearly an infinite loop (as far the compiler knows), and even if upper_bound was declared volatile, it would just have an undefined termination point (whenever upper_bound happens to externally be set to 0 or less). And decrementing a variable (upper_bound_1) an infinite (or indefinite) amount of times has undefined behavior, because of overflow. Therefore, the compiler can choose to do nothing, which is an allowed behavior when it's undefined behavior, obviously. And so, the code becomes:

void foo()
{
   extern int upper_bound_2;
   while( 0 < upper_bound_2 ) { };
}

Which is exactly what you see in the assembly listing for the function that GCC 4.8.2 produces (with -O3):

    .globl  _Z3foov
    .type   _Z3foov, @function
_Z3foov:
.LFB1:
   .cfi_startproc
    movl    upper_bound(%rip), %eax
    testl   %eax, %eax
    jle .L6
.L5:
    jmp .L5
    .p2align 4,,10
    .p2align 3
.L6:
    rep ret
    .cfi_endproc
.LFE1:
    .size   _Z3foov, .-_Z3foov

Which can be fixed by adding a global-scope declaration of the extern variable, as such:

extern int upper_bound;

void bar() 
{
   extern int upper_bound;
   upper_bound--;
}

void foo()
{
   extern int upper_bound;
   for (int i = 0; i < upper_bound; ) {
      bar();
   }
}

Which produces this assembly:

_Z3foov:
.LFB1:
    .cfi_startproc
    movl    upper_bound(%rip), %eax
    testl   %eax, %eax
    jle .L2
    movl    $0, upper_bound(%rip)
.L2:
    rep ret
    .cfi_endproc
.LFE1:
    .size   _Z3foov, .-_Z3foov

Which is the intended behavior, i.e., the observable behavior of foo() is equivalent to:

void foo()
{
   extern int upper_bound;
   upper_bound = 0;
}

answered Sep 26 '14 at 22:12

Mikael Persson

18,174
6
36
52

4

Just because the name introduced by a block-scope `extern` declaration is not visible to name lookup doesn't mean that they don't refer to the same entity. – T.C. Sep 26 '14 at 22:16
1

Basically, everything in your answer is correct up to the point where you say that the compiler is permitted to assume that `upper_bound_1` and `upper_bound_2` are stored in separate memory locations. That's a completely bogus assumption, and not supported by anything in your answer or in the standard. The compiler isn't required to know that they're stored in the same location, but if it doesn't know, it cannot pretend to know they *aren't* stored in the same location either. – Sep 26 '14 at 22:23
@T.C. They refer to the same entity, yes, for the linker, they are the same extern variable), but not for the compiler. At the first occurrence, the compiler marks it as "to-be-found variable called 'upper_bound'", and at the second occurrence the compiler marks it again as "to-be-found variable called 'upper_bound'". But at no point will the compiler say "those two are the same". This will only be resolved by the linker, at which point, it's too late. Do not confuse whole-program req. (e.g. "refer to the same entity") with TU req. (e.g., visible names in scopes). – Mikael Persson Sep 26 '14 at 22:25
2

Whether you're correct about what the standard says or not (I don't yet have an opinion on that), adding the `extern` declaration at the top is certainly a good idea and more clearly expresses the intent. – Keith Thompson Sep 26 '14 at 22:26
If the compiler cannot determine that `bar()` doesn't change the "local extern" `upper_bound` declared inside `foo()`, it *may* not assume that it doesn't change. – dyp Sep 26 '14 at 22:27
To elaborate, the sentence from 3.5/7 you quoted means that the block-scope `extern` doesn't introduce the name `q` into the enclosing namespace scope, even though it does make name `q` refer to the function `X::q()` inside `X::p()` . This is all fine and expected - it's the whole point of allowing block-scope extern declarations, to avoid polluting the namespace, and similar behavior happens elsewhere too (e.g., friend functions defined inside the class definition). But it doesn't mean that if `middle()` also declares `extern void q();`, that the two `q()` suddenly refer to different things. – T.C. Sep 26 '14 at 22:30
@T.C. The two names are required to refer to the same entity in a *whole program*. It's the linker's responsibility to resolve the name references to the same entity. At the stage of compilation and optimization, these rules do not apply yet. Only name lookup rules apply, and the `upper_bound` name from `bar` must be invisible in the context of `foo`'s body, and therefore, the two are not referring to the same variable, as far as the compiler is required to understand the code. The problem at hand is with name-lookups, not with linkage. The compiler does not do linkage! – Mikael Persson Sep 26 '14 at 22:46
@hvd It is not a bogus assumption. Given two extern variables `upper_bound_1` and `upper_bound_2`, the compiler is certainly allowed to assume that modifying one does not modify the other. These are separate variables (as far as the "ignorant" compiler knows), and cannot alias each other. It's just as if you had `int i,j;`, the compiler can assume that modifying `i` does not modify `j`, irrespective of linkage specification. So, that part of my answer is clearly valid, I stand by that. – Mikael Persson Sep 26 '14 at 23:03
3

@MikaelPersson Unless the compiler can definitively prove that the names *do not refer* to the same entity (and it obviously can't), it is not allowed to perform optimizations on the assumption that they don't. The standard draws no distinction between the compiler and the linker; it only cares about *implementations* as a whole. – T.C. Sep 26 '14 at 23:05
@T.C. The standard makes distinctions between things referring to the same declaration and things referring to the same entity (function, variable, etc.). The two extern declarations are required to be considered as not referring to the same declaration (unless there is a matching global declaration). It doesn't distinguish between compilation and linking, but it sets up two separate sets of rules for lookups and for linkage. And this problem simply brings to light a loop-hole (or defect) where the rules disagree, by having a variable refer to separate declarations but to the same entity. – Mikael Persson Sep 26 '14 at 23:22
There is no defect in the standard here whatsoever. The semantics of the C++ abstract machine for this code is well-defined - the two names refer to the same object, so the modification to that object made in `bar()` is visible in `foo()`. The implementation cannot perform optimizations beyond what is allowed by the as-if rule. – T.C. Sep 26 '14 at 23:28
@T.C. In other words, I consider this a bug in the standard, not a bug in the compilers. The rules are broken, not the compilers obeying them. Without this loop-hole, the fact that two variables refer to separate declarations should be sufficient "proof" that they cannot be referring to the same entity (or otherwise alias each other). But this loop-hole breaks that covenant, and that's the problem. – Mikael Persson Sep 26 '14 at 23:28
The observable behaviour of the C++ abstract machine is not described in terms of names and name lookup, but in terms of e.g. printing the *value* of an object (Keith Thompson's example). I'm not sure if you agree that the Standard requires those two declarations to refer to the same entity. But once that's established, the observable behaviour requires reproducing that output ("ok", not "BUG"). How the implementation achieves that goal is outside of the scope of the Standard. – dyp Sep 26 '14 at 23:51
@dyp I understand that and I agree, and that's why I would classify this as a defect in the standard. It is clear, to me, that the intent of the standard is for the rules of name lookup to set things up such that the compiler can simply follow them to associate names to declarations, and then assume that any two things that refer to two distinct declarations (as per those rules) are two distinct entities. But this situation throws a curve-ball where obeying name lookup rules breaks the linkage rules. It throws a monkey wrench in the works of the "as-if" rule. – Mikael Persson Sep 27 '14 at 00:20
Ah, I think I understand now where we disagree: *"that any two things that refer to two distinct declarations (as per those rules) are two distinct entities"* I do not know of any such guarantee. But doesn't [basic.link]/9 as quoted in hvd's answer say that there can be two names which are equal ("the same"), but referring to distinct declarations (name lookup finds different declarations); yet they can refer to the same entity? – dyp Sep 27 '14 at 00:24
@MikaelPersson if you were correct about `upper_bound_1` and `upper_bound_2` being different then the code should fail to link since they are both *odr-used* and there is not exactly one definition provided for each – M.M Sep 27 '14 at 00:29
@dyp Yes, when I say that the name-lookup rules disagree in intent with the linkage rules, I'm talking about to the clash between the rule I quoted (that pertains to name visibility) and the rule hvd quoted (that pertains to linkage). It's the only case I can see where a name-lookup rule sets the compiler on a brutal collision course with a linkage rule down the line. I would concede that a compliant compiler should bend the lookup rules to avoid that collision, but it's also hard to blame a compiler for following the rules (especially when this "collision case" is so rare in practice). – Mikael Persson Sep 27 '14 at 00:44
@MattMcNabb I used those names just to help explain what's happening, just as a visual aid. The compiler sees the two `upper_bound` names as distinct as far as lookup is concerned, but when it gets to the linking stage, the linker sees them as the same. But optimization goes on during compilation, not during linking, so, the compilation context is what is relevant to the observed problem and where it comes from. – Mikael Persson Sep 27 '14 at 00:49
Optimization which changes the observable behaviour is not permitted, and if the linker sees them as the same then the optimization is not permitted. Your last comment suggests you now think it is a compiler bug? – M.M Sep 27 '14 at 01:10
"that any two things that refer to two distinct declarations (as per those rules) are two distinct entities" is not only not guaranteed by the standard, but demonstrably false. Typedefs and using declarations aside, `extern "C"` declarations with the same name refer to the same function regardless of the namespace scope in which the declarations appear, per [dcl.link]/p6. – T.C. Sep 27 '14 at 01:44
@MattMcNabb I guess you could say that it's a compiler bug, but it's subtle than that. I would attribute the blame on the standard committee for creating this loop-hole in the standard, it creates a kind of "damned if you do, damned if you don't" situation for compiler vendors. And knowing the practical issues involved in a compiler "repairing" this lookup problem (it needs to re-do name lookups during the code-generation phase, after inlining has been initiated), I doubt that compiler vendors with this "bug" would even choose to do it, because of the memory and time overhead involved. – Mikael Persson Sep 27 '14 at 01:53
@T.C. The case of multiple extern "C" functions is not a problem. Even if the compiler assumes two functions are distinct entities because it sees two distinct declarations (e.g., same code as in this thread, but with the `upper_bound`s as extern functions), there is nothing dangerous that the compiler could do with them. All you can do with a function is call it or take its address, both of which cannot be completed until linking occurs, at which point they will be linked to the same function. The problem here would not be a problem if the extern variable was read-only, which functions are. – Mikael Persson Sep 27 '14 at 02:06
@MikaelPersson I don't see why it shouldn't assume that two identical names denoting an entity in the same scope actually denote the same entity – M.M Sep 27 '14 at 02:12
@MattMcNabb Huh...? Of course if two identical names appear in the same scope they denote the same entity. The problem at hand here is that there are two identical names in two *different scopes*. – Mikael Persson Sep 27 '14 at 02:28
MikaelPersson "When a block scope declaration of an entity with linkage is not found to refer to some other declaration, then that entity is a member of the innermost enclosing namespace." Both extern declarations share an innermost enclosing namespace. – M.M Sep 27 '14 at 02:36
@MattMcNabb Read the latter part: "However such a declaration does not introduce the member name in its namespace scope." The entity is a member of the namespace, but it's name is not introduced into it. The first part dictates what kind of linking symbol it should generate, and the latter part dictates how it should be treated for name lookup. So, there is no "identical names that appear in the same scope", there is one entity in the enclosing namespace, and a name in each block scope, but no name is introduced in the enclosing namespace. – Mikael Persson Sep 27 '14 at 02:42
@MikaelPersson `upper_bound` and `upper_bound` are identical names. They denote an entity which is in the same scope (namely, the innermost enclosing scope). I don't know why you wrote in quotemarks "identical names that appear in the same scope", I didn't say that. I said "identical names denoting an entity in the same scope". – M.M Sep 27 '14 at 02:44
@MattMcNabb This is getting ridiculous, I have better things to do than argue terminology with you. All the necessary information is already in this thread, read it more carefully if you don't understand it. – Mikael Persson Sep 27 '14 at 02:53
@MikaelPersson `extern "C"` variables behave the same way, i.e., multiple declarations with the same name refer to the same variable regardless of the namespace scope in which the declarations appear. – T.C. Sep 27 '14 at 02:57
@T.C. It's the same problem whether you have extern variable or extern "C" variable. The problem is with the same rule that I quoted in the answer. I'm tired of these pointless deliberations. – Mikael Persson Sep 27 '14 at 03:03
Sorry, that rule absolutely does not apply to `extern "C"` variables, since they must be declared at namespace scope, not block scope. – T.C. Sep 27 '14 at 03:06

Does C++ allow an optimizing compiler to ignore side effects on the for-condition?

2 Answers2