As far as I can tell, compiler extensions may be considered undefined rather than implementation-defined. I am guessing (but do not know for sure) that this applies to the C++ standard as well as the C standard.

Both GCC and LLVM offer an -fexceptions feature that appears to ensure that throwing an exception from C++ code through C code and then catching it in C++ code will behave as expected, i.e., unwinding the stack frames in both C and C++ and invoking the destructors for the C++ locals. (Note: I understand that resources allocated in the C stack frames being unwound will not be freed. That is not part of my question.) Here is the relevant text from the GCC documentation:

If you do not specify this option, GCC enables it by default for languages like C++ that normally require exception handling, and disables it for languages like C that do not normally require it. However, you may need to enable this option when compiling C code that needs to interoperate properly with exception handlers written in C++.

However, I cannot find anything in the C or C++ standards indicating how stack-unwinding should be expected to interact with a stack containing frames compiled from different source languages. The C++ standard appears to only mention unwinding in 15.2, except.ctor, which simply explains the rules regarding destroying local objects when an exception is thrown.

Therefore, is passing an exception through C code undefined behavior, even using a language extension designed to make it work in a well-defined way? Is using such an implementation-provided extension "wrong"?
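For concreteness, the call shape being asked about looks roughly like the sketch below. All names here (`c_layer_invoke`, `throwing_callback`, `Guard`, `throw_through_c_layer`) are hypothetical. The middle function is only a stand-in: in the real scenario it would live in a separate `.c` file compiled with `-fexceptions`, and it is defined in the same C++ translation unit here purely so the sketch is self-contained. As written it therefore shows the shape of the calls, not the behavior of the extension itself.

```cpp
#include <stdexcept>

// Stand-in for a function that would really live in a .c file built with
// -fexceptions; it is defined here only to keep the sketch in one file.
extern "C" {
typedef void (*callback_fn)(void* ctx);
void c_layer_invoke(callback_fn cb, void* ctx) { cb(ctx); }
}

// RAII guard whose destructor records that stack unwinding ran it.
struct Guard {
    bool* destroyed;
    explicit Guard(bool* flag) : destroyed(flag) {}
    ~Guard() { *destroyed = true; }
};

// C++ callback that throws; the exception propagates up through
// c_layer_invoke toward whoever called it.
extern "C" void throwing_callback(void* ctx) {
    Guard guard(static_cast<bool*>(ctx));
    throw std::runtime_error("thrown below the C layer");
}

// Returns true if the exception crossed the middle frame and was caught here.
bool throw_through_c_layer(bool* guard_destroyed) {
    try {
        c_layer_invoke(throwing_callback, guard_destroyed);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

The question is whether the same outcome (exception caught, `Guard` destroyed) can be relied on when `c_layer_invoke` is genuinely compiled as C.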

For context, this question is inspired by two fairly lengthy discussions in the Rust community about stack-unwinding through C code:

Kyle Strand
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/192401/discussion-on-question-by-kyle-strand-is-relying-on-gccs-llvms-fexceptions). – Samuel Liew Apr 26 '19 at 00:23
  • what does 'technically undefined' mean? Code written in Python is undefined by the C++ standard, but that's not a terribly meaningful statement. – Chris Dodd Apr 26 '19 at 06:29
  • @ChrisDodd I've removed the word "technically". – Kyle Strand Apr 29 '19 at 17:03

3 Answers

In the sense that C does not define what happens when you call a function written in a language other than C, much less what happens if that function fails to return but instead ends its lifetime and the lifetime of the C caller in some other way, yes, it is undefined behavior. It is not "implementation-defined behavior", because the defining characteristic of implementation-defined behavior is that the language standard imposes a requirement on implementations that they document a particular behavior, and that is not the case here; the topic in question is completely outside the scope of the relevant standard.

From a standpoint of reasonable and portable C programming, you should not use or depend on -fexceptions. C++ code that's intended to be called from C should catch all exceptions in the outermost extern "C" function (or function exposed via a function pointer to C callers) and translate them into error codes or some other mechanism compatible with C (e.g. a longjmp, but only if it's documented that the C caller has to be prepared for the callee to do so).
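A minimal sketch of that boundary pattern follows. The status codes and function names (`my_status`, `my_double_positive`, `double_positive_impl`) are hypothetical and chosen only for illustration; the point is that every exception is caught in the outermost `extern "C"` function and mapped to a plain integer the C caller can inspect.

```cpp
#include <new>
#include <stdexcept>

// Hypothetical error codes and entry point shared with C callers.
extern "C" {
enum my_status { MY_OK = 0, MY_ERR_INVALID = 1, MY_ERR_NOMEM = 2, MY_ERR_UNKNOWN = 3 };
int my_double_positive(int input, int* out);
}

namespace {
// Ordinary C++ implementation that reports failure by throwing.
int double_positive_impl(int input) {
    if (input <= 0) throw std::invalid_argument("input must be positive");
    return input * 2;
}
}  // namespace

// Outermost extern "C" entry point: no exception escapes into C frames.
extern "C" int my_double_positive(int input, int* out) {
    try {
        if (out == nullptr) return MY_ERR_INVALID;
        *out = double_positive_impl(input);
        return MY_OK;
    } catch (const std::invalid_argument&) {
        return MY_ERR_INVALID;
    } catch (const std::bad_alloc&) {
        return MY_ERR_NOMEM;
    } catch (...) {
        // Anything else is collapsed into a generic failure code.
        return MY_ERR_UNKNOWN;
    }
}
```

A C caller would then check the return value, for example `if (my_double_positive(21, &result) != MY_OK) { /* handle error */ }`, and never needs to know that exceptions exist on the other side.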

R.. GitHub STOP HELPING ICE
  • I understand that it's not implementation-defined, but since, as you said, it's outside the scope of the standard, why is it not "reasonable" to use or depend on an implementation-provided extension? (Your second paragraph reads as a "yes" to my question of "is it wrong", but I don't understand your rationale.) – Kyle Strand Apr 23 '19 at 20:29
  • If you don't value portability and writing to standards, why bother even asking this question? Beyond that, it's a "heavy" extension that depends on multiple components (compiler, ABI, runtime, etc.) that can't be substituted with simple wrappers or replacements, like many extensions can. – R.. GitHub STOP HELPING ICE Apr 24 '19 at 03:16
  • Nothing I've said should imply that I "don't value portability and writing to standards". True, relying on an extension limits portability somewhat, but all three major implementations provide this feature (on Windows, both C++ exceptions and C `longjmp` are implemented with SEH). And of course there are situations (e.g. when writing an OS) in which relying on extensions appears to be unavoidable. – Kyle Strand Apr 24 '19 at 22:03
  • @KyleStrand: "All three major implementations" has nothing to do with portability, which is more important from a standpoint of *potential future implementations*, freedom from constraint for future implementations, and non-monoculture. Implementation of `longjmp` with SEH is probably non-conforming. Your "situations" (like writing an OS) do not depend on any heavy/invasive extensions like exceptions. In practice you barely need more than extern calls to asm, or a swapcontext-like primitive, to do an OS; nearly everything else can be portable C. – R.. GitHub STOP HELPING ICE Apr 24 '19 at 22:35
  • You're being kind of a jerk now, and I'm not sure why. Of course the behavior of major implementations is relevant to portability. And I don't understand why SEH would be nonconforming. – Kyle Strand Apr 25 '19 at 15:34
  • @KyleStrand: I'm not sure if C++ makes it undefined behavior to `longjmp` over dtors, or if it specifies that they don't run. In the latter case, SEH causing them to run would be nonconforming. – R.. GitHub STOP HELPING ICE Apr 25 '19 at 16:53
  • Ah, you mean nonconforming for C++, not for C. I understand now. `cppreference` says that `longjmp`ing over objects with nontrivial destructors is undefined behavior, so executing the destructors should be conforming, I'd think. – Kyle Strand Apr 25 '19 at 22:10
  • @KyleStrand: Some people view the Standard as a complete definition of the language they call C, and view all programs as either being "portable" or "non-portable", with nothing in-between. In their eyes, even if every C implementation ever produced to date has processed some construct in some fashion that is so obviously useful the authors of the Standard thought implementers would support it without being ordered to do so, that would be no reason for programmers to expect that future implementations would continue to do likewise. – supercat May 01 '19 at 15:14

Relying on Implementation Documentation

The essential question here is whether we can rely on specifications provided by a C or C++ implementation. (Since we are dealing with a situation with mixed C and C++ code, I will refer to this combined implementation as a single implementation.)

In fact, we must rely on implementation documentation. The C and C++ standards do not apply unless and until an implementation asserts that it conforms (at least in part) to the standards. The standards have no power of law; they do not apply to any person or undertaking until somebody decides to adopt them. (The C 2018 Foreword refers to an ISO statement explaining the standards are voluntary.)

If an implementation tells you it conforms to the C and C++ standards, and it also tells you it supports throwing C++ exceptions through C code, there is no reason to believe one and not the other. If you accept the implementation’s documentation, then it both conforms to the language standard and supports throwing exceptions through C code. If you do not accept the implementation’s documentation, then there is no reason to expect conformance to the language standards. (This is a general view, neglecting instances where apparent bugs give us reason to doubt specific behaviors, for example.)

If you ask whether passing an exception through C code is “undefined” in the sense used in the C or C++ standards, the answer is yes. But those standards are only discussing what they define. Their use of “undefined” does not prohibit anybody else from defining behavior. In fact, if you are using an implementation’s documentation, you have a definition for the behavior. The C and C++ standards do not undo, negate, or nullify definitions made by other documents:

  • Where the C or C++ standard says any behavior is undefined that only means the behavior is undefined within the context of the C or C++ standard.
  • Any other specification a programmer chooses to use may define additional behavior that is not defined by the C or C++ standard. The C and C++ standards do not prohibit this.

Example

As an example, some of the documents one might rely on to specify the behavior of a commercial software product include:

  • The C standard.
  • The C++ standard.
  • The assembler manual.
  • The compiler documentation.
  • Apple’s Developer Tools documentation, including behaviors of Xcode, the linker, and other tools used during a software build.
  • Processor manuals.
  • Instruction set architecture specifications.
  • IEEE-754 Standard for Floating Point Arithmetic.
  • Unix documentation for command-line tools.
  • Unix documentation for system interfaces.

For much software, it would be impossible to produce the software if the overall behavior were not defined by all these specifications combined. The notion that the C or C++ standard overrides or trumps other documentation is ludicrous.

Writing Portable Code

Any software project, or any engineering project, works from premises: It takes as given various tool specifications, material properties, device properties, and so on, and it derives desired products from those premises. Rarely does any complete end-user commercial product rely solely on the C or C++ standard. When you buy an iPhone, it obeys the laws of physics, and you are entitled to rely on it to conform to safety specifications for electrical devices and to radio frequency behaviors regulated by governmental agencies. It conforms to many specifications, and the notion that the C standard should be regarded as trumping those other specifications is absurd. If your device bursts into flame because of a programming error that the C standard says has undefined behavior, that is not acceptable; the fact that the C standard says it is not defined does not trump the safety specification.

Even in purely software projects, very few strictly conform to the C or C++ standards. Largely, only software that does some pure computations and limited input/output can be written in strictly conforming C or C++. That can include very useful libraries that are included in other software, but it includes very few complete commercial end-user programs—such as a few things used by mathematicians and scientists to answer questions about logic, math, and modeling, for example. Most software in this world interacts with devices and operating systems in ways not defined by the C or C++ standards. Most software uses extensions not defined by the standards—extensions that manipulate files and memory in ways not defined by the standards, that interact with devices and users in ways not defined by the standards. They display GUI windows and accept mouse and keyboard input from the user. They transmit and receive data over a network. They send radio waves to other devices.

These things are impossible without using behaviors not defined by the language standards. And, if the language standards trumped the definitions of these behaviors, writing such software would be impossible. If you wanted to send a Wi-Fi radio signal, and you had adopted the C standard, and the C standard trumped other definitions, that would mean it would be impossible for you to write software that reliably sends a radio signal. Obviously, that is not true. The C standard does not trump other specifications.

Writing “portable code” is not a feasible requirement for most software projects. It is, of course, desirable to contain non-portable code to clear interfaces. It is desirable to write what code one can using portable code so that it can be reused. But this is only part of most projects. For most projects, the project as a whole must use behaviors defined by documents other than the language standards.

Eric Postpischil

The code is not UB, because the code is not written in the C++ language; it is written in the "C++ with gcc/clang extensions" language. In C++ with gcc/clang extensions, the behavior is documented and well defined. In standard C++, the same code would be UB.

So if you take the same code and compile it in pure standard C++ then that code would exhibit UB. But if you compile it in C++ with gcc/clang extensions then the code is well defined.

bolov