
I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does. I'm curious as to what, exactly, the issues are. So far, I've come up with

  1. Name mangling
  2. Exception handling
  3. RTTI

Are there any other ABI issues pertaining to C++?

Puppy
  • I imagine the basic class layout isn't specified, so a class library compiled with one compiler may not be usable with another compiler (e.g. with virtually inheriting classes). – Kerrek SB Sep 20 '11 at 21:54
  • The Windows vs. Unix `wchar_t` is kind of annoying, but I'm not sure how to categorize it :-) – Šimon Tóth Sep 20 '11 at 21:56
  • There are so many acronyms floating around you may want to define ABI. – C.J. Sep 20 '11 at 21:58
  • @Let_Me_Be: That's probably not relevant, because you don't expect the ABI to promise any sort of cross-platform compatibility. – Kerrek SB Sep 20 '11 at 21:59
  • @KerrekSB Well, the wide character functions from the C part of the C++ library don't work on Windows. – Šimon Tóth Sep 20 '11 at 22:02
  • @BenVoigt Well anything that takes `wchar_t` as a parameter obviously. Functions taking `wchar_t*` should be OK. – Šimon Tóth Sep 20 '11 at 22:36
  • @Let_Me_Be: No, it isn't obvious. Please give an example of a "function from the C part of the C++ library" that "doesn't work on Windows". – Ben Voigt Sep 20 '11 at 22:48
  • @BenVoigt OMG, not this conversation again. Windows 2000 was the last system which used a non-variable encoding for `wchar_t` (UCS-2); more recent versions use UTF-16 encoding, which has surrogate pairs. Therefore functions taking `wchar_t` can't work, since they take only one `wchar_t`, not a pair, plus there is no standard way to determine whether the `wchar_t` is a single character or the first/second part of a surrogate pair. – Šimon Tóth Sep 20 '11 at 22:55
  • @Let_Me_Be: According to that logic, functions that take `char` can't work on Linux (since not every character fits in a single `char`, often it's a UTF-8-encoded string of octets). Anyway, this has NOTHING to do with binary compatibility across modules, so I don't know why you brought it up. – Ben Voigt Sep 20 '11 at 23:43
  • @BenVoigt I'm sorry but `char` definitely won't be UTF-8 encoded. If you are reading Unicode input, you need to store it in wide (`wchar_t`) strings, not `char` strings. Unless you are of course reading/storing raw data, in which case it is kind of irrelevant what the underlying type is. And I brought it up because in this case the standard isn't clear enough to explicitly forbid variable-length encoding for `wchar_t`, although this fact is implied in several places. It has to do with binary compatibility across modules, since this is one of the breaking points of MSVC vs GCC on Windows. – Šimon Tóth Sep 21 '11 at 05:26
  • @Let_Me_Be: What the hell are you talking about? Linux uses UTF-8 strings. – Puppy Sep 21 '11 at 12:11
  • @DeadMG No, wide strings are UTF-32 on Linux, narrow strings are ASCII. – Šimon Tóth Sep 21 '11 at 12:42
  • @Let_Me_Be: Linux uses UTF-8 strings practically EVERYWHERE. For example, `open` and `creat`. But this is totally off-topic for this question. If you want to explore it further, provide a link to a relevant question or create a new one. – Ben Voigt Sep 21 '11 at 13:27
  • @BenVoigt Well, you should read that link yourself. Functions you mentioned are actually completely encoding agnostic. Plus they are not part of the C (or C++) standard and are therefore completely irrelevant in this discussion. – Šimon Tóth Sep 21 '11 at 13:33
  • @Let_Me_Be: I don't see a link. No, `open` and `creat` are POSIX, not standard C. But `fopen` is standard C (and standard C++). So is `isalpha`. Windows is no more broken than Linux in this regard. The only reason you don't perceive Linux as broken is because you never use any compiler except gcc. – Ben Voigt Sep 21 '11 at 13:40
  • @BenVoigt So according to you `fopen` expects UTF-8 strings? That just WOW. Here is the link: http://stackoverflow.com/q/7500902/211659 And BTW I use MSVC, Intel and GCC, since I teach C and C++ and therefore need to know how functional each of these major compilers is. – Šimon Tóth Sep 21 '11 at 13:52
  • @Let_Me_Be: MSVC doesn't run on Linux. What two different compilers do you use **on Linux**, that makes you knowledgeable about interoperability between Linux compilers? – Ben Voigt Sep 21 '11 at 13:55
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/3662/discussion-between-let-me-be-and-ben-voigt) – Šimon Tóth Sep 21 '11 at 13:56

5 Answers


Off the top of my head:

C++ Specific:

  • Where the 'this' parameter can be found.
  • How virtual functions are called
    • i.e., does it use a vtable or some other mechanism?
    • What is the layout of the structures used for implementing this.
  • How are multiple definitions handled
    • Multiple template instantiations
    • Inline functions that were not inlined.
  • Static Storage Duration Objects
    • How to handle creation (in the global scope)
    • How to handle creation of function local (how do you add it to the destructor list)
    • How to handle destruction (destroy in reverse order of creation)
  • You mention exceptions, but there is also the question of how exceptions are handled outside main()
    • i.e., those thrown before main() starts or after it returns

Generic:

  • Parameter passing locations
  • Return value location
  • Member alignment
  • Padding
  • Register usage (which registers are preserved which are scratch)
  • Size of primitive types (such as int)
  • Format of primitive types (e.g. floating-point format)
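The generic items (sizes, alignment, padding) can be inspected directly. This is a minimal sketch, assuming a hypothetical struct `Record` made up for the example; none of the printed numbers are fixed by the language, which is exactly why a platform ABI has to pin them down.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical struct: the compiler may insert padding after `tag`
// and after `count` to satisfy alignment requirements.
struct Record {
    char tag;
    double weight;
    short count;
};

// Prints sizes, alignments and member offsets as chosen by this
// compiler for this target.
void print_record_layout() {
    std::printf("sizeof(int)=%zu sizeof(long)=%zu sizeof(void*)=%zu\n",
                sizeof(int), sizeof(long), sizeof(void*));
    std::printf("sizeof(Record)=%zu alignof(Record)=%zu\n",
                sizeof(Record), alignof(Record));
    std::printf("offsets: tag=%zu weight=%zu count=%zu\n",
                offsetof(Record, tag), offsetof(Record, weight),
                offsetof(Record, count));
}
```

Two object files can only exchange a `Record` by value or by pointer if both compilers arrive at the same offsets and the same total size.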
Martin York
  • All of the issues you listed are covered by existing C++ ABIs, for example the ARM ABI. The "only" thing that is not included in existing ABIs is the layout of C++ standard library objects, as I pointed out in my earlier answer. – Lindydancer Sep 21 '11 at 05:51
  • There is no real world C++ implementation that doesn't use vtables. – curiousguy Jan 18 '19 at 21:27

The big problem, in my experience, is the C++ standard library. Even if you had an ABI that dictates how a class should be laid out, different compilers provide different implementations of standard objects like std::string and std::vector.

I'm not saying that it would not be possible to standardize the internal layout of C++ library objects, only that it has not been done before.
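A quick way to see this is to print the size of a few standard library types. The sketch below makes no assumption about the results: the values come from whichever standard library implementation and build options are in use, and they can differ between implementations even on the same platform.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Prints layout-related properties of standard library types.
// Every value here is an implementation detail of the library.
void print_library_layout() {
    std::printf("sizeof(std::string)      = %zu\n", sizeof(std::string));
    std::printf("sizeof(std::vector<int>) = %zu\n", sizeof(std::vector<int>));

    // A non-zero capacity for an empty string hints at a
    // small-string optimization, which the standard does not mandate.
    std::string empty;
    std::printf("capacity of empty string = %zu\n", empty.capacity());
}
```

If two modules disagree on any of these numbers, passing a `std::string` or `std::vector` across their boundary is unsafe, even when both follow the same core language ABI.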

Ben Voigt
Lindydancer
  • @Tux-D It is enough to change a compiler option in Visual Studio 2008 to get incompatible layouts of `std::vector`. – quant_dev Sep 20 '11 at 22:03
  • @Tux-D I think he is hinting at the fact that STL types aren't opaque. – Šimon Tóth Sep 20 '11 at 22:06
  • That's not how I read it. He is implying that the STL objects have different interfaces, which is not true. I quite agree that changing any compiler flags can result in incompatible binary objects. – Martin York Sep 20 '11 at 22:10
  • @Tux-D: They do have different interfaces. When member functions get inlined, the interface includes layout of private objects, not just the public interface defined in the standard. Many implementation details are not specified by the standard. For one example: small-string optimizations. – Ben Voigt Sep 20 '11 at 22:23
  • @quant_dev: The debugging version of MSVC's Standard Library is not layout compatible with the regular version. Actually it's one of the goals of the new libc++ to have a compatible layout whatever the degree of compilation used (which requires external storage of information)... do you know of other situations that could affect the layout? – Matthieu M. Sep 21 '11 at 07:38

The closest thing we have to a standard C++ ABI is the Itanium C++ ABI:

> this document is written as a generic specification, to be usable by C++ implementations on a variety of architectures. However, it does contain processor-specific material for the Itanium 64-bit ABI, identified as such.

The GCC doc explains support of this ABI for C++:

> Starting with GCC 3.2, GCC binary conventions for C++ are based on a written, vendor-neutral C++ ABI that was designed to be specific to 64-bit Itanium but also includes generic specifications that apply to any platform. This C++ ABI is also implemented by other compiler vendors on some platforms, notably GNU/Linux and BSD systems.

As was pointed out by @Lindydancer, you need to use the same C++ standard library/runtime as well.
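Name mangling is one of the things the Itanium C++ ABI specifies, and toolchains that follow it usually ship a `<cxxabi.h>` header with `abi::__cxa_demangle`. The following is a minimal sketch, assuming such a toolchain (GCC or Clang, for example); the helper `readable_name` and the macro `ABI_EXAMPLE_HAVE_CXXABI` are names invented for this example, and on other compilers the helper simply returns whatever `type_info::name()` yields.

```cpp
#include <cstdlib>
#include <string>
#include <typeinfo>
#include <vector>

// Only use <cxxabi.h> where it is available.
#if defined(__has_include)
#  if __has_include(<cxxabi.h>)
#    include <cxxabi.h>
#    define ABI_EXAMPLE_HAVE_CXXABI 1
#  endif
#endif

// Hypothetical helper: returns a human-readable type name.
std::string readable_name(const std::type_info& info) {
#ifdef ABI_EXAMPLE_HAVE_CXXABI
    int status = 0;
    // On Itanium-ABI toolchains info.name() is the mangled name.
    // The result is allocated with malloc and must be freed.
    char* demangled =
        abi::__cxa_demangle(info.name(), nullptr, nullptr, &status);
    if (status == 0 && demangled != nullptr) {
        std::string result(demangled);
        std::free(demangled);
        return result;
    }
    std::free(demangled);  // free(nullptr) is a no-op
#endif
    return info.name();  // fallback: implementation-defined spelling
}
```

For instance, `readable_name(typeid(std::vector<int>))` yields a name containing `vector`, while the raw `typeid(...).name()` string on the same toolchain is the mangled form that the linker sees.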

ysdx
  • As @Lindydancer said, binary compatibility of C++ libraries on Linux has more to do with a single common and universally-used C++ runtime library (provided by g++) than the ABI. – Ben Voigt Sep 20 '11 at 22:25
  • @BenVoigt, true, and that is one of the reasons why you see so many pointers in APIs for passing vector or string content, for example. Now those pointers are getting competition from the likes of `span`, `string_view`, etc. Do you know if there are efforts to make at least views compatible between different STL implementations? – Patrick Fromberg Sep 05 '19 at 14:21
  • @PatrickFromberg, yes there is a 2014 proposal from Herb Sutter http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4028.pdf – ysdx Sep 05 '19 at 19:59

An ABI standard for any language really needs to come from the platform that wants to support it. Language standards, especially those for C and C++, cannot do this for many reasons, chiefly because it would make the language less flexible and less portable, and therefore less used. C does not really have a defined ABI either, but many platforms define one, directly or indirectly. This isn't happening with C++ because the language is much bigger and changes more often. However, Herb Sutter has a very interesting proposal on how to get more platforms to create standard ABIs, and how developers can write code that uses such an ABI in a standard way:

https://isocpp.org/blog/2014/05/n4028

He points out that C++ has a standard way to link against a platform's C ABI, via `extern "C"`, but no equivalent for a C++ ABI. I think this proposal could go a long way toward allowing interfaces to be defined in terms of C++ instead of C.
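Until something like that exists, the usual workaround is to expose a C++ library through an `extern "C"` boundary. The sketch below shows that common pattern, not the mechanism from the proposal; `Greeter`, `greeter_handle` and the `greeter_*` functions are hypothetical names made up for the example. Only pointers, sizes and plain characters cross the boundary, and exceptions are caught before they can leave it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical C++ class that stays private to the library.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::string greet() const { return "Hello, " + name_; }
private:
    std::string name_;
};

// Callers only ever see a pointer to this type; a public header
// would contain just the forward declaration.
struct greeter_handle {
    Greeter impl;
};

extern "C" {

// Returns nullptr on failure instead of letting an exception escape.
greeter_handle* greeter_create(const char* name) {
    try {
        return new greeter_handle{Greeter(name != nullptr ? name : "")};
    } catch (...) {
        return nullptr;
    }
}

// Copies the greeting into a caller-owned buffer (always terminated
// when size > 0) and returns the full length of the greeting.
std::size_t greeter_greet(const greeter_handle* handle,
                          char* buffer, std::size_t size) {
    if (handle == nullptr) return 0;
    try {
        const std::string text = handle->impl.greet();
        if (buffer != nullptr && size > 0) {
            const std::size_t n = std::min(text.size(), size - 1);
            std::memcpy(buffer, text.data(), n);
            buffer[n] = '\0';
        }
        return text.size();
    } catch (...) {
        return 0;
    }
}

void greeter_destroy(greeter_handle* handle) {
    delete handle;
}

}  // extern "C"
```

The cost of this design is visible in the signatures: the `std::string` result has to be flattened into a caller-supplied buffer, which is the kind of friction an agreed C++ ABI would remove.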

Rick Wildes

> I've seen a lot of discussion about how C++ doesn't have a Standard ABI quite in the same way that C does.

What standard C ABI? Annex J of the C99 standard is 27 pages long. In addition to undefined behavior (and some implementations give some UB a well-defined behavior), it covers unspecified behavior, implementation-defined behavior, locale-specific behavior, and common extensions.

David Hammen