18

Let's say you know your software will only run on two's complement machines where signed overflow behavior is nicely defined. Signed overflow is still undefined behavior in C and C++ and the compiler is free to replace your entire program with "ret", start a nuclear war, format your drive, or make demons fly out of your nose.

Suppose you have signed overflow in inline asm: does your program still invoke UB?

If yes, what about separately compiled and linked assembler?

Peter Cordes
Eloff
  • This is a very interesting question. I want to know as well. – callyalater Mar 01 '16 at 18:32
  • Is inline asm even standard C or C++ in the first place? I honestly don't know. If not, the whole question stands on shaky ground I feel. – Baum mit Augen Mar 01 '16 at 18:33
  • Well there is [dcl.asm]: *The `asm` declaration is conditionally-supported; its meaning is implementation-defined.* – NathanOliver Mar 01 '16 at 18:34
  • For C99 the only thing I found is about the asm keyword: _J.5.10_ The asm keyword may be used to insert assembly language directly into the translator output (6.8). – fsasm Mar 01 '16 at 18:44
  • A lot of undefined behaviour is actually defined by the implementation. "Undefined behaviour" does *not* mean "the implementation must make the behaviour as random and evil as possible". – Christian Hackl Mar 01 '16 at 20:29
  • @ChristianHackl no, but it does mean "the implementation may make the behavior anything it wants to" - which like GCC with signed overflow means pretending it can't wrap around, and optimizing away overflow checks of the form x + 1 < x AND the error case code - which is pretty evil IMHO considering that broke a lot of old, working code bases. – Eloff Mar 02 '16 at 16:43
  • [Near duplicate](http://stackoverflow.com/questions/40565835/is-integer-overflow-undefined-in-inline-x86-assembly), but with a specific example: GNU C inline-asm for x86. My answer there explains in detail why that instruction can't cause any weird behaviour on x86, and also points out some things that *are* undefined by the x86 manuals (where a flag could be 0 or 1, depending on the HW, but the possibilities don't include nasal demons for any unprivileged instructions). IDK if these should be merged, or just linked somehow? Maybe edit links into the questions, not close as dup. – Peter Cordes Nov 30 '16 at 06:36
  • @Eloff: I don't consider it astonishing when implementations sometimes perform int math in a fashion consistent with mathematical arithmetic, even when the result exceeds the range of "int" [e.g. deciding that x+1>y is equivalent to x>=y]. What I do consider astonishing are implementations which will use evaluation of x+1>y as a basis for omitting a comparison *elsewhere* which would test x against INT_MAX. IMHO, if a programmer wants to test the wrapping behavior, a quality compiler should allow the expression to be written as (int)(x+1)>y and process it using two's-complement wrapping. – supercat Nov 30 '16 at 20:35
  • @Eloff: Otherwise, I would suggest that a non-sanitizing build on a quality platform should process `x+1 > y` so that it yields 0 or 1, chosen in Unspecified fashion, if x equals INT_MAX. Letting a compiler arbitrarily yield 0 or 1 in such cases may allow useful optimizations; further, I would consider `(int)(x+1) > y` as clearer than `x+1 > y` in cases where two's-complement wrapping is expected and required. – supercat Nov 30 '16 at 20:39
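
For readers following the `x + 1 < x` discussion in the comments: the sketch below shows one common way to test for overflow before it happens, so that the check itself never performs an overflowing signed addition. The function name `add_would_overflow` is made up for this illustration and does not come from any of the posts.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if x + y would overflow an int, else 0.
   Every subtraction below stays within the range of int, so the check
   itself has fully defined behaviour and cannot be optimized away on the
   assumption that overflow never happens. */
int add_would_overflow(int x, int y)
{
    if (y > 0)
        return x > INT_MAX - y;   /* x + y would exceed INT_MAX */
    if (y < 0)
        return x < INT_MIN - y;   /* x + y would fall below INT_MIN */
    return 0;                     /* adding zero never overflows */
}
```

With this, `add_would_overflow(x, 1)` takes the place of the `x + 1 < x` test.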

2 Answers

12

"Undefined behaviour" means that the C and C++ standards, respectively, don't define the behaviour of your program. If your program contains inline assembly, it should be pretty clear that its behaviour won't normally be described by either the C or the C++ standard. Some other standard might even define the behaviour, but that still doesn't mean "defined behaviour" in the context of the C or C++ standard.

That said, the C standard does require documentation of supported extensions. If the behaviour of your program can be inferred from your implementation's documentation, and your implementation makes your program behave differently, that is a failure of your implementation to conform to the standard:

4. Conformance

8 An implementation shall be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions.

For C++, this requirement has been weakened:

1.4 Implementation compliance [intro.compliance]

9 Each implementation shall include documentation that identifies all conditionally-supported constructs that it does not support and defines all locale-specific characteristics.

and

1.9 Program execution [intro.execution]

2 Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined [...] Each implementation shall include documentation describing its characteristics and behavior in these respects. [...]

I'm unable to find a requirement for extensions to be documented, and if documented, to be documented correctly. This would suggest that in C++, even if your implementation defines the behaviour of your program as an extension, if it turns out the documentation is wrong, that's just too bad.

For the C++ semi-standard asm statement (as mentioned in the comments, "The asm declaration is conditionally-supported; its meaning is implementation-defined."), if your implementation supports it, it needs to be documented. Of course, it is common practice for implementations to support inline assembly in a different manner than the C++ standard hints at, so this doesn't give you much extra.

Community
  • Ok, so according to the standard it's implementation defined as to what inline asm means to the program. This begs the question then, what do popular compilers actually do? – Eloff Mar 01 '16 at 19:01
  • @Eloff That's a very different question from what you asked in this one and takes too much to answer here. –  Mar 01 '16 at 19:06
3

As soon as you say that you have signed overflow in inline asm, you are speaking of a particular compiler (or set of compilers), because in C as in C++ support for the asm declaration and its meaning are defined by the compiler.

If the compiler defines the asm keyword by allowing direct inclusion of assembly code in its output, and if the machine allows signed overflow, then a signed overflow in the inline asm is perfectly defined for that compiler and that machine: the result is whatever the processor produces. You should still check whether it can result in a trap representation for a signed integer, but in any case it is defined. The only case that would end in UB would be when the compiler says that some representation in a signed integer will cause undefined behaviour. I know of none that does, and you are already in the context of a defined and finite set of compilers and machines.
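
As a rough illustration of the paragraph above, here is a minimal sketch assuming a GNU-compatible compiler (GCC or Clang) targeting x86 with the default AT&T asm syntax. The helper name `wrapping_add` is invented for this example. The addition is performed by the processor's `add` instruction, so no overflowing signed `+` is ever evaluated in C; on other targets the sketch falls back to unsigned arithmetic, whose conversion back to `int` is implementation-defined rather than undefined.

```c
#include <limits.h>

/* Hypothetical helper: add two ints with two's-complement wraparound. */
int wrapping_add(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* GNU extended asm, AT&T operand order: a += b. */
    __asm__("addl %1, %0"
            : "+r"(a)   /* a is both read and written */
            : "r"(b)    /* b is only read */
            : "cc");    /* the add instruction modifies the flags */
    return a;
#else
    /* Assumption: not x86 or not a GNU-compatible compiler.
       Unsigned addition wraps by definition; converting the result back
       to int is implementation-defined, not undefined. */
    return (int)((unsigned)a + (unsigned)b);
#endif
}
```

On such a setup, `wrapping_add(INT_MAX, 1)` is expected to yield `INT_MIN`, which is what the hardware produces for that addition.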

Separate compilation of an assembly module and C and/or C++ code would be the same for that set of compilers and machines: the result is implementation-defined, which is not the same as UB.

Another example of something that is explicitly implementation-defined in the standards (both C and C++) is whether the char type is signed or not. If you do not know which compiler you use, you cannot rely on it, but as soon as you choose a compiler implementation, that implementation is required to say whether char is signed or unsigned. It is not undefined behaviour, meaning that the compiler cannot, for example, replace the full code with a ret.
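
The char case can be probed directly. The following is a small sketch using only `<limits.h>`; the function names are invented for this illustration. Whatever the answer is on a given implementation, the program's behaviour stays defined.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is signed on this
   implementation, 0 if it is unsigned. CHAR_MIN is 0 for an unsigned
   char type and negative for a signed one. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: widen a plain char to int. The value obtained for
   a char with the high bit set depends on the implementation's choice of
   signedness, but it is one documented value, not undefined behaviour. */
int widen_char(char c)
{
    return c;
}
```

On an implementation with 8-bit signed char, such as GCC on x86, `widen_char((char)0xFF)` typically gives -1; where char is unsigned it gives 255.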

Serge Ballesta