30

This C++ code, perhaps surprisingly, prints "x: 1".

#include <iostream>
#include <string>

std::string x();  // declared, but never defined anywhere

int main() {
    std::cout << "x: " << x << std::endl;  // streams the function name, not a call
    return 0;
}

x is a function prototype, which seems to be viewed as a function pointer, and C++ Standard section 4.12 Boolean conversions says:

4.12 Boolean conversions [conv.bool] 1 A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true. For direct-initialization (8.5), a prvalue of type std::nullptr_t can be converted to a prvalue of type bool; the resulting value is false.

However, x is never bound to a function. As I would expect, the C linker doesn't allow this. However in C++ this isn't a problem at all. Can anyone explain this behavior?

Columbo
chmullig
  • 6
    It's an ODR violation for which no diagnostic is required, meaning that your code has UB. – T.C. Nov 25 '14 at 21:03
  • 1
    @T.C. Ill-formed, not UB. – Lightness Races in Orbit Nov 26 '14 at 01:16
  • 4
    @LightnessRacesinOrbit It's ill-formed NDR, so per [intro.compliance]/2 ("If a program contains a violation of a rule for which no diagnostic is required, this International Standard places no requirement on implementations with respect to that program.") it is essentially UB ("behavior for which this International Standard imposes no requirements", [defns.undefined]). – T.C. Nov 26 '14 at 01:18
  • 1
    @T.C. Meh, I suppose so. Makes me wonder why they bother making a distinction between "ill-formed, no diagnostic required" and "the behaviour is undefined" in the first place, though. I'm sure there's a question about this somewhere... – Lightness Races in Orbit Nov 26 '14 at 01:25
  • 2
    @LightnessRacesinOrbit I think that's a special category for ODR violations – M.M Nov 26 '14 at 01:47
  • @MattMcNabb: It pops up in a few places actually, some having nothing even remotely to do with the ODR (such as mixing user-defined literal suffixes in a sequence of concatenated string literals). – Lightness Races in Orbit Nov 26 '14 at 01:53
  • @LightnessRacesinOrbit I guess the difference is that "ill-formed, no diagnostic required" means the entire program has UB; whereas other forms of UB tend to not be triggered until the related code is executed – M.M Nov 26 '14 at 02:23
  • @MattMcNabb: That may be it – Lightness Races in Orbit Nov 26 '14 at 10:12
  • @MattMcNabb That sounds wrong. If a program contains UB, the behavior of the complete program is undefined, including all bits before that construct is engendered **and compile-time**. – Columbo Nov 26 '14 at 10:15
  • 1
    @Columbo That's patently untrue, most types of UB are only triggered by their statement being encountered. (e.g. `int f() { return 1 / 0; }` is OK as long as `f()` is never called). – M.M Nov 26 '14 at 21:33
  • @MattMcNabb Yeah, but if `main` doesn't call it, `main` (that is, the program) doesn't contain UB. – Columbo Nov 26 '14 at 21:40
  • 1
    In many cases ill-formed NDR is for cases that theoretically *could* be diagnosed at compile time (but takes such an excessive amount of effort that it isn't worth it), while UB is for cases that may be impossible to diagnose at compile time. – T.C. Nov 26 '14 at 23:49

3 Answers

28

What's happening here is that the function pointer is implicitly converted to bool. This is specified by [conv.bool]:

A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true

where "null pointer value" includes null function pointers. Since the function pointer obtained from decay of a function name cannot be null, this gives true. You can see this by including << std::boolalpha in the output command.

The following does cause a link error in g++: (int)x;


Regarding whether this behaviour is permitted or not, C++14 [basic.def.odr]/3 says:

A function whose name appears as a potentially-evaluated expression is odr-used if it is the unique lookup result or the selected member of a set of overloaded functions [...]

which does cover this case, since x in the output expression is looked up to the declaration of x above and that is the unique result. Then in /4 we have:

Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.

so the program is ill-formed but no diagnostic is required, meaning that the program's behaviour is completely undefined.

Incidentally, this clause implies that no link error is required for x(); either. From a quality-of-implementation angle, however, that would be silly. The course that g++ has chosen here seems reasonable to me.

M.M
  • It is. See my answer. – Columbo Nov 25 '14 at 21:24
  • I see that you've quoted N3797, which has a sexier wording than N3337. Will adjust that. :o) – Columbo Nov 25 '14 at 21:29
  • 1
    @Columbo N3936 actually (which is identical to C++14 afaik) – M.M Nov 25 '14 at 21:30
  • Why didn't you use N4140? (Just curious) – Columbo Nov 25 '14 at 21:32
  • @Columbo have you got a download link? (as suggested by the SO C++ document page, authentication is required from the CWG page) – M.M Nov 25 '14 at 21:35
  • @Columbo ty. I guess that should be added to the SO document compilation – M.M Nov 25 '14 at 21:49
  • 2
    @Columbo: Quoting working drafts to prove standard behaviours is not very useful, as they can and do change significantly between standards, with bits added and removed in the meantime. It's always better to quote _standards_ or, if needs be, the last working draft that editorially became a standard. That's what Matt did (yes, N3936 is _effectively_ C++14). – Lightness Races in Orbit Nov 26 '14 at 01:16
  • 1
    @LightnessRacesinOrbit N4140 contains only editorial changes compared to N3936, and one of those changes is particularly nice for quoting (numbered bullets). – T.C. Nov 26 '14 at 01:21
  • 2
    @T.C. Yes but in general, why would you push somebody to throw aside actual International Standard wording in favour of a draft, when the intention is to quote a standard? – Lightness Races in Orbit Nov 26 '14 at 01:24
14

x doesn't need to be "bound" to a function, because your code declares that such a function exists. The compiler can therefore safely assume that the address of this function is not null. For a null address to be possible, you would have to declare the function as a weak symbol, and you didn't. The linker did not complain because you never call the function (you never use its actual address), so it sees no problem.

Freddie Chopin
  • Sure, the *compiler* can assume that. But the *linker* actually links it and produces a working executable. – chmullig Nov 25 '14 at 21:00
  • 4
    `1` isn't the address of the function though. (If you add in `x1`, `x2`, `x3` etc. they all get `1`) – M.M Nov 25 '14 at 21:01
  • 3
    @chmullig - See the edited answer - the linker never sees your `x` symbol, because the compiler never uses it - it optimized away this "test", because according to the language rules it is always true. This would work the same way if you had an `extern` variable and tested its address. – Freddie Chopin Nov 25 '14 at 21:03
  • @FreddieChopin: what does it mean that "you'd have to declare the function to be a weak symbol". I didn't understand it. what is this weak symbol? – Destructor Jan 18 '16 at 13:36
9

[basic.def.odr]/2:

A function whose name appears as a potentially-evaluated expression is odr-used if it is the unique lookup result or the selected member of a set of overloaded functions (3.4, 13.3, 13.4), unless it is a pure virtual function and its name is not explicitly qualified.

Hence, strictly speaking, the code odr-uses the function and therefore requires a definition.
But modern compilers will realize that the function's exact address is not actually relevant for the behavior of the program, and will thus elide the use and not require a definition.

Also note what [basic.def.odr]/3 specifies:

Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.

An implementation is not obliged to halt compilation and issue an error message (=diagnostic). It can do what it considers best. In other words, any action is allowed and we have UB.

Columbo