Why is GCC tricked into allowing undefined behavior simply by putting it in a loop?

Question

The following is nonsensical yet compiles cleanly with g++ -Wall -Wextra -Werror -Winit-self (I tested GCC 4.7.2 and 4.9.0):

#include <iostream>
#include <string>

int main()
{
  for (int ii = 0; ii < 1; ++ii)
  {
    const std::string& str = str; // !!
    std::cout << str << std::endl;
  }
}

The line marked !! results in undefined behavior, yet is not diagnosed by GCC. However, commenting out the for line makes GCC complain:

error: ‘str’ is used uninitialized in this function [-Werror=uninitialized]

I would like to know: why is GCC so easily fooled here? When the code is not in a loop, GCC knows that it is wrong. But put the same code in a simple loop and GCC doesn't understand anymore. This bothers me because we rely quite a lot on the compiler to notify us when we make silly mistakes in C++, yet it fails for a seemingly trivial case.

Bonus trivia:

If you change std::string to int and turn on optimization, GCC will diagnose the error even with the loop.
If you build the broken code with -O3, GCC literally calls the ostream insert function with a null pointer for the string argument. If you thought you were safe from null references if you didn't do any unsafe casting, think again.

I have filed a GCC bug for this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63203 - I'd still like to get a better understanding here of what went wrong and how it may impact the reliability of similar diagnostics.

Welp, +1 to clang: http://coliru.stacked-crooked.com/a/b8bde7751f3c2c0c — Mark Garcia, Sep 08 '14 at 05:09
Perhaps you should fill a bug report on [GCC Bugzilla](https://gcc.gnu.org/bugzilla) ... — Basile Starynkevitch, Sep 08 '14 at 05:10
@BasileStarynkevitch: absolutely. The question here is a higher-level one: how/why should GCC be so easily tricked into accepting code it knows is wrong? I'd like to better understand this; it took me a while to build the right test case for this because the first few I tried didn't fool GCC--I was shocked that a simple loop would defeat the warning. Understanding why such things happen might help in trying to make sure our code isn't broken (previously I would have assumed I did not need to check for self-init of references, now I know I do need to). — John Zwinck, Sep 08 '14 at 05:14
File a bug. Complaining about GCC bugs here is not particularly productive. — n. m. could be an AI, Sep 08 '14 at 05:29
In some ways a duplicate of my question... http://stackoverflow.com/q/25151508/541686 — user541686, Sep 08 '14 at 05:29
@n.m.: I have, and linked it in my post (as an edit, but well before your comment). I am not complaining about the GCC bug here at all, but trying to understand how this sort of thing occurs within a modern compiler. — John Zwinck, Sep 08 '14 at 05:37
Again: it's a bug. Why bugs happen in GCC in unexpected places? Because GCC is a huge an complex beast, that's why. You think this is so simple and easy case this area should be bug-free? Volunteer to maintain it. — n. m. could be an AI, Sep 08 '14 at 05:44
[dcl.ref]#5 "A reference shall be initialized to refer to a valid object or function". Also I think it is UB to use the name of a reference before it has been bound although I can't find the relevant text right now — M.M, Sep 08 '14 at 05:46
By the way it's [bug 18501](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501), from 2004. — n. m. could be an AI, Sep 08 '14 at 05:53
@MattMcNabb: Does the fact that it's a reference actually matter? For example, is `Foo foo(&foo);` valid since `Foo` is not a reference? — user541686, Sep 08 '14 at 06:12

Jonathan Wakely · Accepted Answer · 2014-09-10T11:19:15.533

I'd still like to get a better understanding here of what went wrong and how it may impact the reliability of similar diagnostics.

Unlike Clang, GCC doesn't have logic to detect self-initialized references, so getting a warning here relies on the code for detecting use of uninitialized variables, which is quite temperamental and unreliable (see Better Uninitialized Warnings for discussion).

With an int the compiler can figure out that you write an uninitialized int to the stream, but with a std::string there are apparently too many layers of abstraction between an expression of type std::string and getting the const char* it contains, and GCC fails to detect the problem.

e.g. GCC does give a warning for a simpler example with less code between the declaration and use of the variable, as long as you enable some optimization:

extern "C" int printf(const char*, ...);

struct string {
  string() : data(99) { }
  int data;
  void print() const { printf("%d\n", data); }
};

int main()
{
  for (int ii = 0; ii < 1; ++ii)
  {
    const string& str = str; // !!
    str.print();
  }
}

d.cc: In function ‘int main()’:
d.cc:6:43: warning: ‘str’ is used uninitialized in this function [-Wuninitialized]
   void print() const { printf("%d\n", data); }
                                           ^
d.cc:13:19: note: ‘str’ was declared here
     const string& str = str; // !!
                   ^

I suspect this kind of missing diagnostic is only likely to affect a handful of diagnostics which rely on heuristics to detect problems. These would be the ones that give a warning of the form "may be used uninitialized" or "may violate strict aliasing rules", and probably the "array subscript is above array bounds" warning. Those warnings are not 100% accurate and "complicated" logic like loops(!) can cause the compiler to give up trying to analyse the code and fail to give a diagnostic.

IMHO the solution would be to add checking for self-initialized references at the point of initialization, and not rely on detecting it is uninitialized later when it gets used.

score 1 · Answer 2 · answered Sep 08 '14 at 05:15

1

You claim it's undefined behavior, but when I compile the two cases to assembly, I definitely see the function-scoped variable not being initialized on the stack, and the block-scoped variable getting set to NULL.

That's as much of an answer as you're getting from me. I downloaded the C++ spec to definitively settle this, but fell into a Lovecraftian-type fugue when I gazed upon it, to preserve my fragile sanity...

I strongly suspect the block-scoped case is not actually undefined.

answered Sep 08 '14 at 05:15

BadZen

4,083
2
25
48

2

Using a null reference is undefined behavior. Which of the two lines the UB occurs on may be up for debate, but I think it's clearly UB (and it does segfault). – John Zwinck Sep 08 '14 at 05:20
Sorry, I meant non-initialized. (Hence, no -Werror-uninitialized warning.) – BadZen Sep 08 '14 at 05:22
Well I never claimed anything about non-initialization. GCC did, when I commented out the loop. Whether it says non-initialized or initialized-with-self in the diagnostic doesn't matter to me, I just expect a diagnostic! – John Zwinck Sep 08 '14 at 05:26
Hey, look there's evidently a -Winit-self that is separate from -Wall because of "the common but questionable practice of initializing auto storage class variables with themselves to supress warnings" Spoiler: enabling it doesn't cause your test case to emit a warning. =p – BadZen Sep 08 '14 at 05:32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18016 Fixed in 4.7.0 my foot. Go ahead and submit, I think... – BadZen Sep 08 '14 at 05:33
yes, I used `-Winit-self` in the question as you can see. And you're right, it doesn't help. – John Zwinck Sep 08 '14 at 05:35
@JohnZwinck I dont' see any null references – M.M Sep 08 '14 at 05:42
@MattMcNabb: well that's what GCC emits when you compile with optimization: the ostream insert operator is called with a constant `0` instead of a pointer to a string. So while the code doesn't intentionally build a null reference per se, GCC emits code as if it did. – John Zwinck Sep 08 '14 at 06:12
1

@BadZen It *is* fixed (at least in 4.8 on): the problematic testcase in that report does produce warnings as expected. Seeing as the OP’s case involves no member variable, the relevance of the report you’ve found is questionable. – Luc Danton Sep 08 '14 at 07:37
1

As @LucDanton says, PR18016 is unrelated, it is about self-initialized member variables, which is questionable but not undefined behaviour, so deserves a warning. Self-initialized references are ill-formed and should be an error, always, but 18016 has nothing to do with references. – Jonathan Wakely Sep 08 '14 at 08:45
(Actually clang and EDG only give a warning for self-initialized references, which matches the OP's statement that it's UB not ill-formed, but it's still nothing to do with 18016 :-) – Jonathan Wakely Sep 08 '14 at 09:01
@JonathanWakely: I wish self-initialization of references were ill-formed, but I have not found anything to suggest it is. I am now wondering if this is something to suggest to the C++ committee for C++17 or so. It would seem there is no possible valid use of this, but I am prepared for some to come out of the woodwork (perhaps SFINAE). – John Zwinck Sep 08 '14 at 11:50
SFINAE should go away in favour of concepts in C++17 anyway, so that isn't a good reason. Someone might have a reason, but I'd say it's worth proposing it to WG21. Maybe the Undefined Behaviour study group would be interested. – Jonathan Wakely Sep 08 '14 at 12:13
@JonathanWakely: I have written to ub@isocpp here: http://www.open-std.org/pipermail/ub/ (it's pending moderator approval). Thank you. – John Zwinck Sep 10 '14 at 11:17
@JonathanWakely: ...and my posting to the UB mailing list was rejected by the moderators. There have been no messages accepted for roughly three months--maybe the UB group is no longer? Anyway, if you have the right ears listening you're more than welcome to steal this proposal. – John Zwinck Sep 11 '14 at 11:01
1

@JohnZwinck, the SG12 chair tells me you need to subscribe to the list before posting, the list is active. – Jonathan Wakely Sep 11 '14 at 14:48
@JonathanWakely: my message was accepted this time; the thread is here for posterity: http://www.open-std.org/pipermail/ub/2014-September/000506.html and some discussion has begun already. Thanks again. – John Zwinck Sep 14 '14 at 10:05

Why is GCC tricked into allowing undefined behavior simply by putting it in a loop?

2 Answers2

Linked