18

I've been experimenting with constexpr. On my test compiler (g++ 4.6) the code below fails to compile with an error about an out-of-bounds access. Is a compiler required to spot this at compile time?

#include <iostream>

constexpr const char *str = "hi";

constexpr int fail() {
  return str[1000]; // Way past the end!
}

template <int N>
struct foo {
  static void print() { std::cout << N << std::endl; }
};

int main() {  
  foo<fail()>::print();
}
Flexo
  • 1
    The compiler has to be able to determine the value of `fail()` at compile time. Since it cannot, it produces an error. Sounds logical to me. – Kerrek SB Sep 14 '11 at 18:45
  • @Kerrek - But if it's undefined behaviour it would be quite reasonable for an implementation of a compiler to pick a random value for `N` or to fail during linking. Or to fail in some *weird* way at run time, or even during compile time? – Flexo Sep 14 '11 at 18:51
  • 1
    I can't see any discussion of it in `§ 5.19 [expr.const]` in the draft I have. Clearly you don't want buffer overflows in your compiler (look out, ideone!), but that doesn't mean the only sane solution is an error message. – Flexo Sep 14 '11 at 18:53
  • 1
    Re UB: In this case, with the bounds known & violated at compile-time, it would even require work to produce weird behaviour (at least without compromising the whole compiler). I hope other compiler writers will be sane enough to produce errors as well, even if it will not be required by the standard. – Georg Fritzsche Sep 14 '11 at 18:59
  • @Georg - I agree you wouldn't want to ship a compiler without doing something and spotting it is so easy that it would be hard to argue against it I suspect, but I was puzzled because of all the "classic" UB I'd considered so far this one is the only one I can't see explicitly prohibited. It's more of an academic question than a practical one. – Flexo Sep 14 '11 at 19:07
  • @awoodland: I don't think this is undefined behaviour. It's just a plain error. The compiler knows that it doesn't know the value of `fail()`. – Kerrek SB Sep 14 '11 at 19:45
  • @Kerrek - as much as I'd like to believe that were the case or at least the intention of the standard I'm having a hard time squaring it with statements like § 7.1.5 - 7 *"A call to a `constexpr` function produces the same result as a call to an equivalent non-`constexpr` function in all respects except that a call to a `constexpr` function can appear in a constant expression"*. (It would also make the trivial way of adding `constexpr` support to an existing compiler, i.e. build and execute on the side another program as part of compilation that evaluates `constexpr`s, infeasible.) – Flexo Sep 14 '11 at 21:11
  • @awoodland - the compiler can't produce something that is equivalent to a non-constexpr function, so it fails to compile. It could say "compilation error: cannot ensure that result follows the standard", but that wouldn't be very user-friendly. – Tomer Vromen Sep 14 '11 at 21:43
  • This question inspired me to ask a related question I've been wondering about for a while: http://stackoverflow.com/questions/7424647/can-a-constant-expression-subscript-a-string-literal – Potatoswatter Sep 15 '11 at 00:41

2 Answers

11

§5.19/2 (on the second page; it really should be split into many paragraphs) forbids constant expressions containing

— an lvalue-to-rvalue conversion (4.1) unless it is applied to

— a glvalue of integral or enumeration type that refers to a non-volatile const object with a preceding initialization, initialized with a constant expression, or

— a glvalue of literal type that refers to a non-volatile object defined with constexpr, or that refers to a sub-object of such an object

str[1000] translates to * ( str + 1000 ), which does not refer to a subobject of str, in contrast with an in-bounds array access. So this is a diagnosable rule, and the compiler is required to complain.

EDIT: It seems there's some confusion about how this diagnosis comes about. The compiler checks an expression against §5.19 when it needs to be constant. If the expression doesn't satisfy the requirements, the compiler is required to complain. In effect, it is required to validate constant expressions against anything that might otherwise cause undefined behavior.* This may or may not involve attempting to evaluate the expression.

 * In C++11, "a result that is not mathematically defined." In C++14, "an operation that would have undefined behavior," which by definition (§1.3.24) ignores behavior that the implementation might define as a fallback.

Potatoswatter
  • It's certainly wrong, but can you prove that it's diagnosable please? – Lightness Races in Orbit Sep 15 '11 at 00:00
  • 1
    `s/diagnosable/required to be diagnosed/` I can count on one hand the number of things of this ilk that require diagnostics, and I'd be amazed were this one of them. – Lightness Races in Orbit Sep 15 '11 at 00:01
  • @Tomalak: §1.4/1: The set of diagnosable rules consists of all syntactic and semantic rules in this International Standard except for those rules containing an explicit notation that “no diagnostic is required” or which are described as resulting in “undefined behavior.” – Potatoswatter Sep 15 '11 at 00:04
  • You're right, and that's a very clever reading. The result of the dereference is undefined, and since it's undefined it cannot be a `constexpr`. – Omnifarious Sep 15 '11 at 00:28
  • OK, but if it's undefined then no diagnostic is required. – Lightness Races in Orbit Sep 15 '11 at 00:52
  • @Tomalak: It's not undefined. 5.19/2 contains no annotation about “no diagnostic is required” or “undefined behavior.” Diagnosability is the *default*. I'm not sure where Omni got "The result of the dereference is undefined", or how you can count required diagnostics on one hand, but this has absolutely nothing to do with UB. – Potatoswatter Sep 15 '11 at 00:55
  • Dereferencing an invalid object is undefined. I cba to look it up at this hour -- might give it a go in the morning -- but I'm fairly sure of that. – Lightness Races in Orbit Sep 15 '11 at 00:59
  • 2
    @Tomalak: No dereference occurs because the expression cannot be formed in the first place. §5.19 describes what can be included in a constant expression. This cannot. End of story. – Potatoswatter Sep 15 '11 at 01:02
  • Where's the lvalue-to-rvalue conversion pre-dereference? – Lightness Races in Orbit Sep 15 '11 at 01:04
  • @Tomalak: The dereference is identical to the lvalue-to-rvalue conversion. Because it is forbidden by 5.19, there is a diagnosable rule, and diagnosable means diagnosis is required. – Potatoswatter Sep 15 '11 at 01:11
  • @Potatoswatter - Is it allowed if it's `str[1]` rather than `str[1000]`? – Omnifarious Sep 15 '11 at 03:33
  • 2
    @Omni: `str[1]` is a subobject of a glvalue of literal type that refers to a non-volatile object defined with constexpr. `str[1000]` is not. One is well-defined, the other is disallowed. – Potatoswatter Sep 15 '11 at 03:40
  • @Omni: Depending how you parse the text, it might not be allowed by this rule because the string literal is not defined as `constexpr`. See my question http://stackoverflow.com/questions/7424647/can-a-constant-expression-subscript-a-string-literal . But that's really a separate issue — if you use an appropriate array, and set a pointer to point into it, the compiler *is* required to track the array bound. – Potatoswatter Sep 15 '11 at 03:48
  • @Potatoswatter: So, I think 'undefined behavior' is relevant, it's just that since the dereference is never performed, the undefined behavior is never invoked. The compiler notices that the dereference would result in undefined behavior, which is most definitely not a constexpr and gives a required diagnostic. Otherwise, I can't see how `str[1]` and `str[1000]` are any different. – Omnifarious Sep 15 '11 at 03:49
  • 1
    @Omni: Exactly. Except that there's no broad inference about finding anything that might cause UB, it's just an application of a particular rule of 5.19. An example of a constant expression containing UB would be signed integer overflow. One could reasonably expect wraparound, a compile-time error, or a trapping value getting compiled into the program and causing a problem at runtime. – Potatoswatter Sep 15 '11 at 03:52
  • OK, yes the expression `str+1000` causes this diagnosable behaviour before the dereference takes place. It was the constant vague use of "it" that I was trying to clear up :) – Lightness Races in Orbit Sep 15 '11 at 08:45
  • @Potatoswatter - Interestingly, causing a signed integer overflow in the evaluation of a `constexpr` results in an error, or if you turn off the error, the expression loses `constexpr` status. – Omnifarious Sep 15 '11 at 22:49
  • @Omni: Answering another question, I just noticed §5/5: "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, [and] such an expression is a constant expression (5.19), […] the program is ill-formed." So that error is required, too, not UB. The answer seems right so far and my previous comment is wrong. – Potatoswatter Sep 17 '11 at 04:46
  • @Potatoswatter - I saw that quote before and read it as e.g. division by 0 and integer under/overflow, but didn't read invalid array index in any of the things mentioned there. – Flexo Sep 17 '11 at 20:26
  • @awoodland: Yes, the discussion between Omni and me got off-topic. §5/5 does not relate to an invalid array index. – Potatoswatter Sep 17 '11 at 21:07
  • @Potatoswatter: If an implementation specifies that when `y` is zero the expression `x/y` will yield `x` (which from what I can tell would be perfectly legal), would it be legal for the compiler to regard `int x=42/0;` as being equivalent to `int x=42;` since, while the result wouldn't fit with the normal rules of integer math, would fit with the way the implementation legitimately defines the `/` operator's mathematical behavior. – supercat May 18 '15 at 21:07
  • @supercat That is still undefined behavior within the scope of the standard. Something like `42/(1+1.e-50-1)` could be implementation-dependent, though. – Potatoswatter May 19 '15 at 05:45
  • @Potatoswatter: Unless something has changed, the Standard defines Undefined Behavior as behavior over which it imposes *no requirements*. If the Standard is saying that `int x=42/0;` isn't allowed to set `x` to 42, then it is imposing a requirement. While it may be reasonable for the standard to require that implementations forbid `int x=42/0;`, that would imply that it was not "Undefined Behavior", but rather *forbidden* behavior. – supercat May 19 '15 at 14:45
  • @supercat If UB happens, then the implementation can do anything. In the specific context of constant expression evaluation, the implementation is required to detect *causes* of UB, *before* it would happen and 1) if the context requires a constant expression, throw an error, or otherwise 2) perform the evaluation at runtime. – Potatoswatter May 19 '15 at 17:03
5

Yes, the compiler is supposed to catch this at compile time. Section 5.19 Constant expressions, paragraph 2, of the draft C++ standard lists this as an exclusion for constant expressions:

an operation that would have undefined behavior [ Note: including, for example, signed integer overflow (Clause 5), certain pointer arithmetic (5.7), division by zero (5.6), or certain shift operations (5.8) —end note ];

CWG issue 695, as far as I can tell, says that an expression with undefined behavior is simply non-constant, and that a diagnostic results when it is used where a constant expression is required:

The consensus of the CWG was that an expression like 1/0 should simply be considered non-constant; any diagnostic would result from the use of the expression in a context requiring a constant expression.

You can find more details in my self-answered question here, which also goes into the uses of this feature.

Shafik Yaghmour