Suppose a C file is compiled with a C compiler and relies on behavior that is defined in C but not in C++. Can you link it with a C++ file without invoking undefined behavior?

In blah.c (the file compiled as C):

struct x {
    int blah;
    char buf[];
};

extern char * get_buf(struct x * base);
extern struct x * make_struct(int blah, int size);

blah_if.h

extern "C"
{
    struct x;

    char * get_buf(struct x * base);
    struct x * make_struct(int blah, int size);
}

some_random.cpp (Compiled with a C++ compiler)

#include <cstring>   // std::strcpy
#include "blah_if.h"

...

x * data=make_struct(7, 12);
std::strcpy(get_buf(data), "hello");

Is code that relies on C's flexible array member, in a file compiled with a C compiler, still well-defined when it is called from a file compiled as C++ and linked against the object file produced by the C compiler?

Note that because a C compiler is used and struct x is opaque, this is different from:

Does extern C with C++ avoid undefined behavior that is legal in C but not C++?

Glenn Teitelbaum
  • As long as well-defined source code went into the relevant compiler, you're fine linking the object code. – Galik Aug 07 '15 at 16:45
  • @GlennTeitelbaum You should have improved your original question in that direction, instead of posting a new one. – πάντα ῥεῖ Aug 07 '15 at 16:46
  • @πάνταῥεῖ Can you and Drew Dormann please discuss? He said to leave that question as extern "C"; you are saying to edit that question to be about linkage. There are two questions, extern "C" and linkage, and Drew felt I asked the former and should leave it. – Glenn Teitelbaum Aug 07 '15 at 16:48
  • I for one think it should be two separate questions, as he is asking about two techniques. – NathanOliver Aug 07 '15 at 16:49
  • @this I'm actually doing so, you may vote to reopen if you like. – πάντα ῥεῖ Aug 07 '15 at 16:49
  • @πάνταῥεῖ I did suggest the action that Glenn took. If you're interested, see my comments on the linked question. – Drew Dormann Aug 07 '15 at 16:51
  • @GlennTeitelbaum It's about linkage primarily. See this related: http://stackoverflow.com/questions/18877437/undefined-reference-to-errors-when-linking-static-c-library-with-c-code/18879053#18879053 – πάντα ῥεῖ Aug 07 '15 at 16:51
  • @πάνταῥεῖ No, it's about UB and its relation to linkage, not name mangling. – Glenn Teitelbaum Aug 07 '15 at 16:53
  • @GlennTeitelbaum `extern "C"` requires unmangled C symbols, so you can be sure that it was compiled using plain C. If the C compiler that was used to compile that code guarantees the behavior is defined, it will still be defined when linked to C++ code that would consider such a statement UB. – πάντα ῥεῖ Aug 07 '15 at 16:59

2 Answers

The behavior is implementation-defined.

[dcl.link] Linkage from C++ to objects defined in other languages and to objects defined in C++ from other languages is implementation-defined and language-dependent.

It continues:

Only where the object layout strategies of two language implementations are similar enough can such linkage be achieved.

That sentence in the standard really should be an annotation, since it doesn't specify what counts as "similar enough".
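
One way to reduce how much depends on "similar enough" for the struct in the question is to keep every layout decision on the C side, so the C++ translation unit only ever holds an opaque pointer. A minimal sketch in C, assuming two hypothetical helpers (`x_header_size` and `x_buf_offset` do not appear in the question) that report the layout the C compiler chose:

```c
#include <stddef.h>  /* offsetof, size_t */

/* Same definition as in the question's blah.c; only C code ever sees it. */
struct x {
    int blah;
    char buf[];  /* flexible array member: valid in C99 and later, not standard C++ */
};

/* Hypothetical helper: size of the fixed part of struct x,
   as laid out by the C compiler. */
size_t x_header_size(void)
{
    return sizeof(struct x);
}

/* Hypothetical helper: byte offset at which buf starts. */
size_t x_buf_offset(void)
{
    return offsetof(struct x, buf);
}
```

A C++ caller that declares these inside `extern "C"` never has to compute either value itself. Whether a `struct x *` is represented the same way on both sides is still up to the implementation, as the quoted text says.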

Raymond Chen
  • I think this is the correct answer, and at least the safest, but I can't imagine how the subroutine as compiled by the C compiler could break just because the entry was from C++ (with `extern "C"` to use C calling conventions etc.). – Grady Player Aug 07 '15 at 17:00
  • Odd that the language supports `extern "C"` for linkage but doesn't count C as a defined linkage. – Glenn Teitelbaum Aug 07 '15 at 17:07
  • I'm not sure this quote from the spec is relevant. I see linkage from C++ to functions written in C. The C++ spec requires this to work. I don't see any objects being linked. – arx Aug 07 '15 at 21:50
  • @arx Do you have the part of the spec that requires linkage from C++ to functions written in C to work? If so, that would be a good answer. – Glenn Teitelbaum Aug 20 '15 at 14:39
  • Even if functions are exempt from the quoted section of [dcl.link], I don't see any requirement that `struct x*` in C and `struct x*` in C++ be identically represented. – Raymond Chen Aug 20 '15 at 15:24

As Raymond has already said, this is implementation-defined at the formal, language level.

But it's important to remember what your compiled code is. It's not C++ code any more, nor is it C code. The rules about the behaviour of code in those languages apply to code written in those languages. They are taken into consideration during the parsing and translation process. But, once your code has been translated into assembly, or machine code, or whatever else you translated it to, those rules no longer apply.

So it's effectively meaningless to ask whether compiled C code has UB. If you had a well-defined C program, and compiled it, that's that. You're out of the realm of being able to discuss whether the compiled program is well-defined or not. It's a meaningless distinction, unless you have somehow managed to generate a program that is dictated to have UB by the specification for your assembly or machine language dialect.

The upshot of all this is that the premise of your question is unsound. You can't "avoid undefined behaviour" when linking to the compiled result of a C program, because the very notion of "undefined behaviour" does not exist there. But, as long as the original source code was well-defined when you translated it, you will be fine.
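
To make that concrete for the question's code, here is a minimal sketch of how blah.c might implement the two functions. The bodies are an assumption; the question only shows the declarations. All access to the flexible array member happens here, in code translated by the C compiler, and the C++ caller only passes pointers through.

```c
#include <stdlib.h>  /* malloc */
#include <string.h>  /* memset */

struct x {
    int blah;
    char buf[];  /* flexible array member, well-defined in C99 and later */
};

/* Allocate a struct x with room for `size` bytes in buf.
   Returns NULL on a negative size or on allocation failure.
   Assumption of this sketch: the caller releases the result with free(). */
struct x *make_struct(int blah, int size)
{
    struct x *p;

    if (size < 0)
        return NULL;

    p = malloc(sizeof *p + (size_t)size);
    if (p == NULL)
        return NULL;

    p->blah = blah;
    memset(p->buf, 0, (size_t)size);  /* start with an empty buffer */
    return p;
}

/* Return a pointer to the start of the flexible array member. */
char *get_buf(struct x *base)
{
    return base->buf;
}
```

From some_random.cpp the calls look exactly as in the question: the C++ side sees only the incomplete type `struct x` and never touches `buf` directly.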

Lightness Races in Orbit
  • Eloquently put. Did you mean to imply that there can be no problems once you have reached the target language? Linking between different calling conventions probably qualifies as a variety of undefined behaviour. – Jon Chesterfield Aug 07 '15 at 17:25
  • This describes one model of compilation, but not the only one. Another model is link-time code generation: under this model, source code gets compiled to an intermediate language, and it is at link time that actual assembly code gets generated. That intermediate language could have UB. – Raymond Chen Aug 07 '15 at 18:13
  • @RaymondChen: Where is this model defined, and is it within the scope of either C or C++? – Lightness Races in Orbit Aug 07 '15 at 18:39
  • @LightnessRacesinOrbit The standard does not address when code generation occurs. It is an implementation detail. Both [gcc](http://gcc.gnu.org/wiki/LinkTimeOptimization) and [Visual Studio](https://msdn.microsoft.com/en-us/library/xbf3tbeh(VS.80).aspx) support it. It sometimes goes by the name "whole program optimization". – Raymond Chen Aug 07 '15 at 19:04