11

I have a *.cpp file that I compile with a C++ compiler (not a C compiler). The containing function relies on a cast (see the last line) which seems to be defined in C (please correct me if I am wrong!), but not in C++ for this particular type.

[...] C++ code [...]

struct sockaddr_in sa = {0};
int sockfd = ...;
sa.sin_family = AF_INET;
sa.sin_port = htons(port);
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

[...] C++ code [...]

Since I compile this in a C++ file, is this now defined or undefined behaviour? Or would I need to move this into a *.c file, to make it defined behaviour?

S.S. Anne
Daniel Stephens
  • 1
    The file extension has no meaning; what matters is whether you compile it as C or C++. – Fredrik Oct 05 '19 at 16:43
  • It was used to imply which compiler I used, in this case C++ – Daniel Stephens Oct 05 '19 at 16:46
  • See https://stackoverflow.com/questions/3178342/compiling-a-c-program-with-gcc. Otherwise, if your compiler accepts C++ files and compiles them as if they were C files, they are as good as C files, and vice versa. So yes, it depends entirely on how your compiler treats the file you give it; you will probably have to mention the compiler you use to get precise answers. – Mihir Luthra Oct 05 '19 at 16:49
  • In my example the function is used in a `*.cpp` file, so it is surrounded by C++ code and therefore in C++ context – Daniel Stephens Oct 05 '19 at 16:51
  • I don't see why it would not work in C++. The cast is an object-inheritance-like trick in C. – Jean-François Fabre Oct 05 '19 at 16:52
  • 1
    The corresponding types are not inherited from each other and have no relationship and that is undefined in C++ – Daniel Stephens Oct 05 '19 at 16:52
  • 1
    Typically, if the file has `.c` extension, C compiler is invoked automatically. – Igor R. Oct 05 '19 at 16:57
  • Correct, my code is in a C++ file. The question would be, how this will be counted if this is surrounded by C++ code – Daniel Stephens Oct 05 '19 at 16:58
  • 1
    I do that trick all the time in C++ code. No idea why it won't work for you. Missing a header somewhere? – user4581301 Oct 05 '19 at 17:00
  • It does work, I am just wondering if that `officially` would be UB or not – Daniel Stephens Oct 05 '19 at 17:01
  • There is no UB in this C++ code. That cast is valid. – user7860670 Oct 05 '19 at 17:02
  • @VTT this is UB in C++ no? (see https://timsong-cpp.github.io/cppwp/basic.lval#11) – Daniel Stephens Oct 05 '19 at 17:04
  • 3
    @DanielStephens This program never tries to dereference the pointer. [The cast itself is allowed](https://timsong-cpp.github.io/cppwp/expr.reinterpret.cast#7); dereferencing is allowed only sometimes. If the C side correctly casts the pointer back to its real type, then everything should be fine. Problems with the cast may occur if those types had different alignment requirements. – user7860670 Oct 05 '19 at 17:05
  • @user4581301 there are a lot of idioms that were popular long ago that continue to work, even though they are officially undefined behavior. Because there was so much old code relying on them, the compiler writers try to keep them working even without guarantees. That doesn't make them good practice. – Mark Ransom Oct 05 '19 at 17:08
  • @StoryTeller I think these objects are not pointer-interconvertible, they are not related at all and are not subobjects. – user7860670 Oct 05 '19 at 17:11
  • @VTT - Won't that depend on how exactly the implementation defines them? – StoryTeller - Unslander Monica Oct 05 '19 at 17:13
  • @StoryTeller Yes, but none of the common implementations define them in such manner. – user7860670 Oct 05 '19 at 17:23
  • 1
    It's C++ code, in a C++ source file, compiled as C++. What does C have to do with it? And why do you think this is UB in C++? – Lightness Races in Orbit Oct 05 '19 at 17:23
  • I have misinterpreted the question. This question isn't about whether the code compiles or not, but whether it's strictly legal. I'm getting the smurf out of here. – user4581301 Oct 05 '19 at 17:24
  • @LightnessRacesinOrbit Because it's a function call from C times used within C++ context. Context matters ;-) And I expected it to be UB since both types are unrelated – Daniel Stephens Oct 05 '19 at 17:35
  • 1
    The historical ancestry of the functions used in your code doesn't appear to be relevant. If your code is C++, your code is C++. Period, full stop. The second point is far more interesting, and I also wonder about this: personally I believe a common prefix makes the cast valid, but I have yet to verify that. – Lightness Races in Orbit Oct 06 '19 at 01:04
  • The cast itself is not undefined behavior [accessing the value would violate strict aliasing](https://stackoverflow.com/a/51228315/1708801) bind is part of the implementation. – Shafik Yaghmour Oct 06 '19 at 02:08
  • You should write code converting from struct sockaddr_in to struct sockaddr; no pointer manipulation. – drowa Nov 26 '19 at 06:18

2 Answers

7

This is defined in both C++ and C. It does not violate the strict aliasing rules, as it does not dereference the resulting pointer.

Here's the quote from C++ (thanks to @interjay and @VTT) that allows this:

An object pointer can be explicitly converted to an object pointer of a different type.

Here's the quote from C (thanks @StoryTeller) that allows this:

A pointer to an object type may be converted to a pointer to a different object type.

These specify that one pointer type can be converted to another pointer type (and then optionally converted back) without consequence.
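As a rough illustration of that round trip, here is a minimal sketch. The helper names `as_generic`, `as_inet`, and `round_trip_preserves_pointer` are made up for this example, and it assumes a POSIX system where `struct sockaddr` has no stricter alignment than `struct sockaddr_in` (checked with a `static_assert`). The intermediate `sockaddr *` is never dereferenced; only the pointer converted back to its real type is.

```cpp
#include <netinet/in.h>   // sockaddr_in
#include <sys/socket.h>   // sockaddr, AF_INET

// The round trip is only guaranteed to give back the original pointer when
// the target type's alignment is no stricter than the source type's.
static_assert(alignof(sockaddr) <= alignof(sockaddr_in),
              "assumed: sockaddr is not more strictly aligned than sockaddr_in");

// Hypothetical helper: the same conversion the question writes as a C-style cast.
inline sockaddr* as_generic(sockaddr_in* sa) {
    return reinterpret_cast<sockaddr*>(sa);
}

// Hypothetical helper: convert back to the real type before touching members.
inline sockaddr_in* as_inet(sockaddr* p) {
    return reinterpret_cast<sockaddr_in*>(p);
}

// Returns true if converting there and back yields the original pointer
// and the object is still readable through its real type.
bool round_trip_preserves_pointer() {
    sockaddr_in sa{};
    sa.sin_family = AF_INET;

    sockaddr* generic = as_generic(&sa);   // not dereferenced here
    sockaddr_in* back = as_inet(generic);  // back to the real type

    return back == &sa && back->sin_family == AF_INET;
}
```

Nothing here reads through the `sockaddr *`, which is why the aliasing question never comes up on the calling side.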

And here's the quote from POSIX that allows this specific case:

The sockaddr_in structure is used to store addresses for the Internet address family. Pointers to this type shall be cast by applications to struct sockaddr * for use with socket functions.

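As this function (bind) is part of the implementation (it is specified by POSIX, not by the C standard library), whatever goes on inside it (specifically, dereferencing the cast pointer) is the implementation's responsibility and does not give your code undefined behavior.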
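If your own C++ code ever needs to look at an address it only holds as a `struct sockaddr *`, one way to avoid dereferencing through the wrong type is to copy the bytes out with `std::memcpy`. This is only a sketch: `family_of` and `to_inet` are hypothetical helpers, and it assumes (as POSIX implementations do in practice) that the family field sits at the same offset in `sockaddr` and `sockaddr_in`.

```cpp
#include <cstddef>        // offsetof
#include <cstring>        // std::memcpy
#include <netinet/in.h>   // sockaddr_in, htons
#include <sys/socket.h>   // sockaddr, sa_family_t, AF_INET

// Hypothetical helper: read the address family by copying its bytes,
// instead of evaluating p->sa_family on an object that is not a sockaddr.
// Assumption: the family field has the same offset in every sockaddr_* type.
sa_family_t family_of(const sockaddr* p) {
    sa_family_t fam;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(p);
    std::memcpy(&fam, bytes + offsetof(sockaddr, sa_family), sizeof fam);
    return fam;
}

// Hypothetical helper: copy the address into a real sockaddr_in object.
// Returns false (leaving `out` untouched) if the family is not AF_INET.
// Assumption: p points at storage at least sizeof(sockaddr_in) bytes long.
bool to_inet(const sockaddr* p, sockaddr_in& out) {
    if (family_of(p) != AF_INET) {
        return false;
    }
    std::memcpy(&out, p, sizeof out);
    return true;
}
```

Reading through `unsigned char` and `std::memcpy` is allowed for any object, so this sidesteps the aliasing rules at the cost of a small copy.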


To answer the more general question:

C and C++ are two different languages. If something is defined in C but not in C++, it's defined in C but not in C++. No implied compatibility between the two languages will change that. If you want to use code that is well-defined in C but is undefined in C++, you'll have to use a C compiler to compile that code.

S.S. Anne
  • Thanks for your answer! Do you have a link which refers to "defined in POSIX"? – Daniel Stephens Oct 05 '19 at 17:05
  • What exactly is undefined here? – interjay Oct 05 '19 at 17:09
  • @interjay The cast from the `struct sockaddr_in *` to the `struct sockaddr *` violates strict aliasing in C++. – S.S. Anne Oct 05 '19 at 17:10
  • 1
    Casting is allowed. Strict aliasing only prevents dereferencing after the cast, but this isn't done here. – interjay Oct 05 '19 at 17:10
  • 1
    @interjay Can you back that up with a quote from the standard (either C or C++ is fine)? – S.S. Anne Oct 05 '19 at 17:11
  • The POSIX reference would be better linked to [clause 2.4](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_04), which says “POSIX.1-2017 is currently specified in terms of the shell command language and ISO C…” – Eric Postpischil Oct 05 '19 at 17:13
  • @JL2210 Here's a reference posted by VTT above: https://timsong-cpp.github.io/cppwp/expr.reinterpret.cast#7. Can *you* back up the claim that this is undefined? – interjay Oct 05 '19 at 17:14
  • The fact that the C++ standard does not define behavior does not prevent a C++ implementation from defining it. I believe GCC has some support for aliasing, via command-line switches, regardless of the C++ standard. So the statement “This must be compiled as C with a POSIX-conforming C implementation for it to be defined” is dubious. It can be compiled with a C++ compiler that supports aliasing. – Eric Postpischil Oct 05 '19 at 17:17
  • @ericpostpischil Wouldn't that be implementation defined behavior that's left up to the compiler. – Hatted Rooster Oct 05 '19 at 17:19
  • @interjay I assume in `bind()` the pointer is dereferenced to be read though. – Hatted Rooster Oct 05 '19 at 17:20
  • Having trouble finding the quote for this being defined in C. Can anyone help? – S.S. Anne Oct 05 '19 at 17:20
  • @SombreroChicken But `bind` is presumably not implemented in C++. The fact that the calling code is C++ doesn't limit what `bind` can do. – interjay Oct 05 '19 at 17:23
  • @JL2210 I'm not convinced it violates strict aliasing; there's a common prefix, no? – Lightness Races in Orbit Oct 05 '19 at 17:24
  • @SombreroChicken: In the C and C++ standards, “implementation defined” means something that **must be** defined by the implementation, not just something that **is** defined by the implementation. If the behavior is “undefined” by the standard, an implementation does not have to do anything with it, but it **may** define it. – Eric Postpischil Oct 05 '19 at 17:26
  • 1
    There you go https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7 – StoryTeller - Unslander Monica Oct 05 '19 at 17:26
  • @StoryTeller that domain is giving me a certificate error, I am spooked. It can't StoryTell now, mind finding a better host for that quote or writing it down here? – Hatted Rooster Oct 05 '19 at 17:27
  • @SombreroChicken - Really? That's bad. I have way too many answers to fix if that's the case O_O – StoryTeller - Unslander Monica Oct 05 '19 at 17:28
  • 1
    @StoryTeller Thanks. Added the HTTP version instead, though. – S.S. Anne Oct 05 '19 at 17:28
  • @StoryTeller I can contact the owner of the page on the #musl IRC to have them update their certificates. It shouldn't be that hard. – S.S. Anne Oct 05 '19 at 17:29
  • @JL2210 Thanks for updating your answer! You mention `as it does not dereference the resulting pointer`. Does that imply dereferencing it on the `C++`-side would make it UB then? – Daniel Stephens Oct 05 '19 at 17:43
  • I am not sure it stops there. Prompted by a recent CppCon talk, and referring to the quote you cited, `An object pointer can be explicitly converted to an object pointer of a different type.`, the following example introduces two classes whose alignment requirements do not differ from each other (that's how the quote continues). But this is UB, and that becomes visible when Clang and GCC are compared: https://gcc.godbolt.org/z/pvBYmS – Daniel Stephens Oct 05 '19 at 17:49
  • @DanielStephens Yes. – S.S. Anne Oct 05 '19 at 17:49
  • 2
    Arguably this is a defect in the POSIX standard -- the second argument to `bind` should be a `const void *`, but `bind` predates the existence of `void` in the C language (and the existence of C++ at all). They updated it at some point to add the `const`, but never fixed the basic type. – Chris Dodd Oct 05 '19 at 17:50
  • That explains then why the example in my comment above is UB. hm.. – Daniel Stephens Oct 05 '19 at 17:50
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/200455/discussion-on-answer-by-jl2210-what-happens-if-undefined-c-behaviour-meets-c-d). – Samuel Liew Oct 05 '19 at 23:39
  • @LightnessRacesinOrbit What exactly do you want to be addressed? – S.S. Anne Oct 06 '19 at 01:06
  • @JL2210 Whether (and why) the cast is valid per strict aliasing rules. – Lightness Races in Orbit Oct 06 '19 at 01:08
  • @DanielStephens Is there anything I can do to improve this answer so you can re-accept it? – S.S. Anne Oct 06 '19 at 01:29
  • @LightnessRacesinOrbit So... What should I change to address that? – S.S. Anne Oct 06 '19 at 01:31
  • @JL2210 Say whether it is, and explain why – Lightness Races in Orbit Oct 06 '19 at 02:16
  • @DanielStephens No problem. Done it myself a couple of times. – S.S. Anne Oct 06 '19 at 02:21
  • @LightnessRacesinOrbit Is this good? Any other details that it might be a good idea to include? – S.S. Anne Oct 06 '19 at 02:22
  • 1
    I can recommend this talk https://www.youtube.com/watch?v=_qzMpk-22cc I am still not sure if it validates the answer, or if it is actually saying the opposite :-o C++ is a nightmare.. lol – Daniel Stephens Oct 06 '19 at 02:22
  • Alright, [looks like the rule I was remembering does exist, but only applies to unions](https://stackoverflow.com/a/20789178/560648). So it _is_ UB here. Whoops, I use this trick all the time! (Because it's common from C) Probably fine in practice though. You're not wrong to say that the cast itself is fine regardless, but I think the spirit of the question (the pointer _is_ going to be dereferenced!) demands a little exploration of the aliasing problem. – Lightness Races in Orbit Oct 06 '19 at 02:23
  • @LightnessRacesinOrbit This is slightly different. The resulting pointer is not dereferenced outside of the standard C library, so somehow it's defined. – S.S. Anne Oct 06 '19 at 02:24
  • Good point! That should also be discussed in the answer (which should provide standard quotes to support it) – Lightness Races in Orbit Oct 06 '19 at 02:26
  • @LightnessRacesinOrbit My standard-searching capabilities are not up-to-par with the rest of Stack Overflow. Any suggestions? – S.S. Anne Oct 06 '19 at 02:28
-4

Calls between C and C++ code all invoke Undefined Behavior, from the point of view of the respective standards, but most platforms specify such things.

In situations where parts of the C or C++ Standard and an implementation's documentation together define or describe an action, but other parts characterize it as Undefined, implementations are allowed to process code in whatever fashion would best serve their customers' needs or--if they are indifferent to customer needs--whatever fashion they see fit. The fact that the Standard regards such matters as outside their jurisdiction does not imply any judgment as to when and/or how implementations claiming suitability for various purposes should be expected to process them meaningfully, but some compiler maintainers subscribe to a myth that it does.

supercat