
For API/ABI compatibility across many toolchains with the same binary, it is well known that STL containers, std::string, and other standard library classes such as iostreams are verboten in public headers. (The exceptions are: distributing one build per supported toolchain version; delivering source with no binaries for end-user compilation, neither of which is a preferred option in the present case; or translating to some other container inline so that a differing std implementation doesn't get ingested by the library.)

If one already had a published library API that did not follow this rule (asking for a friend), what is the best path forward while maintaining as much backwards compatibility as I reasonably can and favoring compile-time breakages where I can't? I need to support Windows and Linux.

Re the level of ABI compatibility I'm looking for: I don't need it to be insanely future-proof. I'm mainly looking to do just one library binary for multiple, popular Linux distros per release. (At present, I release one per compiler and sometimes special versions for a special distro (RHEL vs Debian). Same sort of concerns with MSVC versions -- one DLL for all supported MSVC versions would be ideal.) Secondarily, if I don't break the API in a bugfix release, I would like it to be ABI-compatible and a drop-in DLL/SO replacement without rebuilding the client application.

I have three cases with some tentative suggestions, modeled after Qt to a degree.

Old public API:

// Case 1: Non-virtual functions with containers
void Foo( const char* );
void Foo( const std::string& );

// Case 2: Virtual functions
class Bar
{
public:
    virtual ~Bar() = default;
    virtual void VirtFn( const std::string& );
};

// Case 3: Serialization
std::ostream& operator << ( std::ostream& os, const Bar& bar );

Case 1: Non-virtual functions with containers

In theory we can convert std::string uses to a class very much like std::string_view but under our library's API/ABI control. It will convert within our library header from a std::string so that the compiled library still accepts but is independent of the std::string implementation and is backwards compatible:

New API:

class MyStringView
{
public:
    MyStringView( const std::string& ) // Implicit and inline
    {
        // Convert, possibly copying
    }

    MyStringView( const char* ); // Implicit
    // ...   
};

void Foo( MyStringView ); // Ok! Mostly backwards compatible

Most client code that is not doing something abnormal like taking the address of Foo will work without modification. Likewise, we can create our own std::vector replacement, though it may incur a copying penalty in some cases.
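
As a rough sketch of what such a view might look like: the pointer-plus-length layout, the members `data()`, `size()`, and `to_string()`, and the stand-in body and return type of `Foo` are all assumptions made for illustration, not the actual library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view: two plain members whose layout the library controls.
class MyStringView
{
public:
    MyStringView() = default;

    // Implicit and inline: only data()/size() of the caller's std::string are
    // read here, so the compiled library never depends on the std::string layout.
    MyStringView( const std::string& s ) : data_( s.data() ), size_( s.size() ) {}

    // Implicit; a null pointer is treated as an empty view.
    MyStringView( const char* s ) : data_( s ), size_( s ? std::strlen( s ) : 0 ) {}

    MyStringView( const char* s, std::size_t n ) : data_( s ), size_( n )
    {
        assert( s != nullptr || n == 0 );
    }

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

    // Inline, so the copy is built with the client's own std::string.
    std::string to_string() const
    {
        return size_ ? std::string( data_, size_ ) : std::string();
    }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};

// Stand-in for the exported function; a real one would be compiled into the
// library. It returns the length here only so the sketch has observable output.
inline std::size_t Foo( MyStringView sv ) { return sv.size(); }
```

With this shape, existing calls such as `Foo( "abc" )` and `Foo( someStdString )` keep compiling, and no copy is made in either case because the view only borrows the caller's buffer.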

Abseil's TotW #1 recommends starting at the util code and working up instead of starting at the API. Any other tips or pitfalls here?

Case 2: Virtual functions

But what about virtual functions? We break backwards compatibility if we change the signature. I suppose we could leave the old one in place with final to force breakage:

// Introduce base class for functions that need to be final
class BarBase
{
public:
    virtual ~BarBase() = default;
    virtual void VirtFn( const std::string& ) = 0;
};

class Bar : public BarBase
{
public:
    void VirtFn( const std::string& str ) final
    {
        VirtFn( MyStringView( str ) );
    }

    // Add new overload, also virtual
    virtual void VirtFn( MyStringView );
};

Now an override of the old virtual function will break at compile time, but calls with std::string will be automagically converted. Overrides must switch to the new MyStringView version; any that still use the old signature fail to compile.
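
To make the intended behavior concrete, here is a small sketch of a migrated client. `ClientBar`, its `lastSize` member, and the minimal `MyStringView` are hypothetical; only `BarBase` and `Bar` follow the declarations above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal view with the two implicit conversions discussed above.
struct MyStringView
{
    MyStringView( const std::string& s ) : data( s.data() ), size( s.size() ) {}
    MyStringView( const char* s ) : data( s ), size( std::strlen( s ) ) {}
    const char* data;
    std::size_t size;
};

class BarBase
{
public:
    virtual ~BarBase() = default;
    virtual void VirtFn( const std::string& ) = 0;
};

class Bar : public BarBase
{
public:
    // Old signature: sealed, and forwarded to the new overload.
    void VirtFn( const std::string& str ) final { VirtFn( MyStringView( str ) ); }

    // New signature: this is the one clients override.
    virtual void VirtFn( MyStringView ) {}
};

// Hypothetical client subclass that has migrated to the new signature.
class ClientBar : public Bar
{
public:
    using Bar::VirtFn; // keep the std::string overload visible on ClientBar

    void VirtFn( MyStringView sv ) override { lastSize = sv.size; }

    // Leaving an old-style override in place is a compile-time error,
    // because Bar declares that function final:
    // void VirtFn( const std::string& ) override {}

    std::size_t lastSize = 0;
};
```

A call such as `bar.VirtFn( someStdString )` through a `Bar&` selects the sealed overload and reaches the client's new override. One thing to watch under these assumptions: with both overloads visible, a call with a string literal is ambiguous, since converting to std::string and to MyStringView are both user-defined conversions.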

Any tips or pitfalls here?

Case 3: Serialization

I'm not sure what to do with iostreams. One option, at the risk of some inefficiency, is to define them inline and reroute them through strings:

MyString ToString( const Bar& ); // I control this, could be a virtual function in Bar if needed

// Here I publicly interact with a std object, so it must be inline in the header
inline std::ostream& operator << ( std::ostream& os, const Bar& bar )
{
    return os << ToString( bar );
}

If I make ToString() a virtual function, I can iterate over all Bar objects and call the user's overrides, because it depends only on MyString objects; those are defined in the header, which is where they interact with std objects like the stream.
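
A sketch of how this could fit together, assuming a hypothetical owning `MyString` (heap buffer, move-only) and a stand-in `Bar` with a single `id` field; none of these details come from the real library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical owning string under the library's control. In a real DLL/SO the
// constructor and destructor would be compiled into the library so that the
// buffer is allocated and freed by the same runtime.
class MyString
{
public:
    explicit MyString( const char* s )
        : size_( std::strlen( s ) ), data_( new char[size_ + 1] )
    {
        std::memcpy( data_, s, size_ + 1 );
    }
    MyString( MyString&& other ) noexcept : size_( other.size_ ), data_( other.data_ )
    {
        other.size_ = 0;
        other.data_ = nullptr;
    }
    MyString( const MyString& ) = delete;
    MyString& operator=( const MyString& ) = delete;
    ~MyString() { delete[] data_; }

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_; // declared before data_, which is sized from it
    char* data_;
};

struct Bar { int id; }; // hypothetical stand-in for the real class

// Library side: produces text without touching any std type.
inline MyString ToString( const Bar& bar )
{
    char buf[32];
    const int n = std::snprintf( buf, sizeof buf, "Bar(%d)", bar.id );
    assert( n > 0 && static_cast<std::size_t>( n ) < sizeof buf );
    return MyString( buf );
}

// Header side: the only places where a std stream is touched.
inline std::ostream& operator << ( std::ostream& os, const MyString& s )
{
    return os.write( s.data(), static_cast<std::streamsize>( s.size() ) );
}

inline std::ostream& operator << ( std::ostream& os, const Bar& bar )
{
    return os << ToString( bar );
}
```

The cost is one intermediate buffer per streamed object, which is the inefficiency mentioned above.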

Thoughts, pitfalls?

metal
  • Why is it verboten? Why is writing your own classes that replicate the standard library a better thing? – user975989 Mar 01 '18 at 22:28
  • The most common thing I see is for people to use C-level interfaces (char*, ...) into C++ libraries, i.e., they don't look like C++ at all. The fact that it's C++ is transparent; you could recode it in C and nobody would know. – pm100 Mar 01 '18 at 22:31
  • Neither C nor C++ has a standard ABI. And the "well-known" answer you link to has 3 upvotes - not exactly authoritative. –  Mar 01 '18 at 22:34
  • To quote from your link: "Actually this is not only true for STL containers but applies to pretty much any C++ type (in particular also all other standard library types)." I.e., don't expose C++ interfaces if you want a stable ABI. Using your own C++ doesn't help. – pm100 Mar 01 '18 at 22:38
  • @user975989 Why is it forbidden? Because of API/ABI compatibility. See this [Q&A](https://stackoverflow.com/a/21186451/201787) or [this talk at Meeting C++ 2017](https://www.youtube.com/watch?v=k9PLRAnnEmE). – metal Mar 01 '18 at 23:33
  • @pm100 But there are ways to maintain API/ABI compatibility in the way that, say, Qt does for Windows and Linux, right? They distribute one [set of binaries](https://www1.qt.io/offline-installers/) for each of these platforms, and they maintain their own containers like `QVector`, which do minimal interaction with the standard library and presumably a [holy buildbox approach for Linux](https://github.com/phusion/holy-build-box). – metal Mar 01 '18 at 23:43
  • Do you want ABI or API compatibility with the std version? Nothing above maintains ABI. Are you OK with an ABI break, maintaining API? How insanely future-proof do you want your new ABI to be? How much work are you willing to do to get a stable ABI? – Yakk - Adam Nevraumont Mar 01 '18 at 23:45
  • @Yakk I don't think I can get _std_ ABI compatibility, which is why I'm proposing my own versions of these classes. Then I control the API and ABI, no? Can you explain why I do not get ABI compatibility across, say, a wide subset of GCC versions 4.8-7 and clang 3.8-6 and MSVC 2012-2017 if I control the full interface, as with `MyStringView`? How does Qt do it? – metal Mar 01 '18 at 23:53
  • @metal MyStringView is not abi compatible with std string. Adding overloads to virtual methods is not abi compatible with much of anything, at least in MSVC. (Non overloaded members can be with lots of care). But you haven't answered my question. – Yakk - Adam Nevraumont Mar 02 '18 at 00:07
  • @Yakk Sorry, I should have been more specific about what kind of ABI compatibility I'm looking for. I don't need it to be insanely future proof. I'm mainly looking to do just one library binary for multiple, popular Linux distros per release. At present, I release one per compiler and sometimes special versions for a special distro (RHEL vs Debian). Same sort of concerns with MSVC versions -- one DLL for all supported MSVC versions would be ideal. Secondarily, if I don't break the API in a bugfix release, I would like it to be ABI-compatible and a drop-in DLL/SO replacement without rebuilding. – metal Mar 02 '18 at 00:13
  • @metal So **do you care if there is an ABI break between the std string version and the first non-std-string version**? Yes or no. That is, whether an existing client of your library has to recompile. "No" is an answer. Please provide one; I am now asking for the third time. – Yakk - Adam Nevraumont Mar 02 '18 at 00:19
  • @Yakk I had assumed there must be an API break, but I am trying to minimize it. If there is a way to do it without breaking the API, I would be open to that, but not at all costs. – metal Mar 02 '18 at 00:47
  • Did I say API? Why are you answering a question about **ABI** with an answer about **API**? **Do you care if there is an ABI break between the std string version and the first non-std-string version**? Voting to close as unclear what you are asking; I give up trying to get clarification. 4 times. Sigh. – Yakk - Adam Nevraumont Mar 02 '18 at 03:08
  • @Yakk No I don't care if there must be an ABI break between the two. (I assumed above that an API break as described also entails an ABI break. I don't see how it could be otherwise, and I apologize that I was unclear.) – metal Mar 02 '18 at 03:23
  • Distribute sources, let clients compile, ABI solved. – Matthieu M. Mar 02 '18 at 07:18
  • Case 1: Just overload `Foo(const char*)` with an inline function `Foo(const std::string& arg) { Foo(arg.c_str()); }`. No need to go through an extra data type. – cmaster - reinstate monica Mar 02 '18 at 11:19
  • @MatthieuM. I'm basically just trying to match what Qt and others do in this regard -- distributing binaries that work widely and well (granted, Qt _also_ distributes as source so you can do it yourself in a pinch). – metal Mar 02 '18 at 13:40
  • Note that Qt has a pure C++ API and they succeed in guaranteeing ABI compatibility across major versions. Which takes care, but is definitely not impossible. They are ABI compatible within the boundaries of the underlying compiler/system. So each Visual Studio version needs a separate binary distribution (although I believe with VS2017 Microsoft has improved that situation). – rubenvb Mar 05 '18 at 12:11
  • @ArneVogel Really? Alas, you've reminded me that StackOverflow, while more high-minded than most sites, is still the internet where people impute motives and talk to others in a demeaning way that they never would employ in real life. There is a real question in your reply, but the rest of your comment makes it not worth my time to respond. – metal Mar 05 '18 at 13:25
  • Fine, I'll give you the benefit of the doubt and rephrase my comment: You start out saying that it is "well-known" that this was "verboten", conflating opinion with fact. The answer you link to doesn't even claim that (even if it did, it wouldn't prove your point), but only says that if you do include std classes, you have to provide different binaries for different implementations. The contentiousness was pointed out multiple times to you, yet you have not taken the time to rephrase this sentence, which may lead others to believe you are intentionally disrespectful. – Arne Vogel Mar 06 '18 at 06:56
  • @ArneVogel I tried to clarify and added more links. I had also added the third paragraph in response to previous comments from Yakk (who got the requested clarification and did provide a substantive answer). See what you think? Is that sufficient? The main issue for me is that I want to release a minimal set of libraries that work across a wide set of compiler / stdlib implementations. The different std implementations complicate things because my library can't ingest std objects from alternate implementations (e.g., GCC 4.8 vs. 5), but I want one lib to rule them all if possible, like Qt. – metal Mar 06 '18 at 14:09

2 Answers


Tier 1

Use a good string view.

Don't use a std::string const& virtual overload; there is no reason for it. You are breaking ABI anyhow. Once they recompile, they'll see the new string-view based overload, unless they are taking and storing pointers to virtual functions.

To stream without going through an intermediate string, use continuation-passing style:

void ToString_CPS( Bar const& bar, FunctionView< void( MyStringView ) > cps );

where cps is called repeatedly with partial buffers until the object has been fully serialized. Write << on top of that (inline in headers). There is some unavoidable overhead from function pointer indirection.

From now on, use virtual only in interfaces, never overload virtual methods, and always add new methods at the end of the vtable. That means not exposing complex hierarchies. Extending a vtable is ABI-safe; adding to the middle is not.

FunctionView is a simple hand-rolled, non-owning std::function clone whose state is a void* and an R(*)(void*, Args&&...); that pair should be ABI-stable to pass across a library boundary.

template<class Sig>
struct FunctionView;

template<class R, class...Args>
struct FunctionView<R(Args...)> {
  FunctionView()=default;
  FunctionView(FunctionView const&)=default;
  FunctionView& operator=(FunctionView const&)=default;

  template<class F,
    std::enable_if_t<!std::is_same< std::decay_t<F>, FunctionView >{}, bool> = true,
    std::enable_if_t<std::is_convertible< std::result_of_t<F&(Args&&...)>, R >{}, bool> = true
  >
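  FunctionView( F&& f ):
    // Cast away constness so that const callables can be stored in a void*.
    ptr( const_cast<void*>( static_cast<void const*>( std::addressof(f) ) ) ),
    f( [](void* ptr, Args&&...args)->R {
      return (*static_cast< std::remove_reference_t<F>* >(ptr))(std::forward<Args>(args)...);
    } )
  {}

  // Invoke the referenced callable; it must still be alive.
  R operator()( Args...args ) const {
    return f( ptr, std::forward<Args>(args)... );
  }
private:
  void* ptr = 0;
  R(*f)(void*, Args&&...args) = 0;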
};
template<class...Args>
struct FunctionView<void(Args...)> {
  FunctionView()=default;
  FunctionView(FunctionView const&)=default;
  FunctionView& operator=(FunctionView const&)=default;

  template<class F,
    std::enable_if_t<!std::is_same< std::decay_t<F>, FunctionView >{}, bool> = true
  >
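  FunctionView( F&& f ):
    // Cast away constness so that const callables can be stored in a void*.
    ptr( const_cast<void*>( static_cast<void const*>( std::addressof(f) ) ) ),
    f( [](void* ptr, Args&&...args)->void {
      (*static_cast< std::remove_reference_t<F>* >(ptr))(std::forward<Args>(args)...);
    } )
  {}

  // Invoke the referenced callable and discard its result; it must still be alive.
  void operator()( Args...args ) const {
    f( ptr, std::forward<Args>(args)... );
  }
private:
  void* ptr = 0;
  void(*f)(void*, Args&&...args) = 0;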
};

This lets you pass generic callbacks over your API barrier.

// f can be called more than once, be prepared:
void ToString_CPS( Bar const& bar, FunctionView< void(MyStringView) > f );
inline std::ostream& operator<<( std::ostream& os, const Bar& bar )
{
  ToString_CPS( bar, [&](MyStringView str) {
    os << str; // the callback's result is ignored, so nothing is returned
  });
  return os;
}

Also implement operator<< for std::ostream& and MyStringView const& inline in the headers.
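
For illustration, an end-to-end sketch of the continuation-passing approach. The compact `FunctionView` here is a simplified variant written for this example (void-returning only), and `Bar`'s fields and the chunking inside `ToString_CPS` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical minimal view type (pointer plus length).
struct MyStringView
{
    MyStringView( const char* s ) : data( s ), size( std::strlen( s ) ) {}
    MyStringView( const char* s, std::size_t n ) : data( s ), size( n ) {}
    const char* data;
    std::size_t size;
};

// Header-side streaming of a view.
inline std::ostream& operator<<( std::ostream& os, MyStringView const& sv )
{
    return os.write( sv.data, static_cast<std::streamsize>( sv.size ) );
}

template<class Sig> class FunctionView;

// Compact non-owning callable reference; the callback's result is discarded.
template<class... Args>
class FunctionView<void( Args... )>
{
public:
    template<class F,
        std::enable_if_t<!std::is_same<std::decay_t<F>, FunctionView>::value, bool> = true>
    FunctionView( F&& f )
        : ptr_( const_cast<void*>( static_cast<void const*>( std::addressof( f ) ) ) ),
          call_( []( void* p, Args&&... args ) {
              (*static_cast<std::remove_reference_t<F>*>( p ))( std::forward<Args>( args )... );
          } )
    {}

    void operator()( Args... args ) const { call_( ptr_, std::forward<Args>( args )... ); }

private:
    void* ptr_;
    void (*call_)( void*, Args&&... );
};

struct Bar { int id; const char* name; }; // hypothetical stand-in

// Library side: emits the text in several pieces, with no std types involved.
inline void ToString_CPS( Bar const& bar, FunctionView<void( MyStringView )> emit )
{
    char buf[32];
    const int n = std::snprintf( buf, sizeof buf, "%d", bar.id );
    assert( n > 0 && static_cast<std::size_t>( n ) < sizeof buf );
    emit( "Bar{" );
    emit( MyStringView( buf, static_cast<std::size_t>( n ) ) );
    emit( ", " );
    emit( bar.name );
    emit( "}" );
}

// Header side: each chunk goes straight to the stream, no intermediate string.
inline std::ostream& operator<<( std::ostream& os, Bar const& bar )
{
    ToString_CPS( bar, [&os]( MyStringView chunk ) { os << chunk; } );
    return os;
}
```

The lambda is a temporary that outlives the call to `ToString_CPS`, which is all a non-owning view needs; storing the `FunctionView` beyond that call would dangle.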


Tier 2

Forward every operation from a C++ API in headers to extern "C" pure-C functions (i.e., pass StringView as a pair of char const* pointers). Export only an extern "C" set of symbols. Now changes in symbol mangling no longer break your ABI.

C ABI is more stable than C++, and by forcing you to break library calls down into "C" calls you can make ABI breaking changes obvious. Use C++ header glue to make things clean, C to make ABI rock solid.
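
A minimal sketch of that split, assuming a hypothetical exported function `mylib_foo` and a view stored as a begin/end pointer pair; in a real project the three sections would live in the public C header, the public C++ header, and the library source respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// --- Exported surface: plain C types only (hypothetical name) ---
extern "C" std::size_t mylib_foo( const char* begin, const char* end );

// --- Header-only C++ glue, compiled by the client's own toolchain ---
struct MyStringView
{
    MyStringView( const std::string& s ) : begin( s.data() ), end( s.data() + s.size() ) {}
    MyStringView( const char* s ) : begin( s ), end( s + std::strlen( s ) ) {}
    const char* begin;
    const char* end;
};

// The C++ API is a thin inline forwarder; nothing C++-specific is exported.
inline std::size_t Foo( MyStringView s ) { return mylib_foo( s.begin, s.end ); }

// --- Library side: the only symbol that has to stay stable ---
extern "C" std::size_t mylib_foo( const char* begin, const char* end )
{
    assert( begin <= end );
    return static_cast<std::size_t>( end - begin ); // stand-in for real work
}
```

Because the wrapper is inline, a change to `MyStringView` only requires clients to recompile against the new header; the exported symbol and its signature stay the same.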

You can keep your pure virtual interfaces if you are willing to risk it; use the same rules as above (simple hierarchies, no overloads, only add to the end) and you'll get decent ABI stability.

Yakk - Adam Nevraumont

Containers

For accepting strings and arrays as function arguments, use std::string_view and gsl::span respectively, or your own equivalents with a stable ABI. Non-contiguous containers can be passed in as ranges of any_iterator.

For returning by reference you can again use these classes.

For returning a string by value, you can return a std::string_view to a thread-local global object that is valid until the next API call (like the std::ctime function). The user must make a deep copy if necessary.
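
A small sketch of that pattern, using a hypothetical library function `DescribeId`, a hand-rolled `MyStringView` in place of std::string_view, and an arbitrary 64-byte buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stable-ABI view: pointer plus length.
struct MyStringView
{
    const char* data;
    std::size_t size;
};

// Library side: formats into a per-thread buffer and returns a view of it.
// The view is only valid until the next call on the same thread.
inline MyStringView DescribeId( int id )
{
    thread_local char buffer[64];
    const int n = std::snprintf( buffer, sizeof buffer, "id-%d", id );
    assert( n > 0 && static_cast<std::size_t>( n ) < sizeof buffer );
    return MyStringView{ buffer, static_cast<std::size_t>( n ) };
}

// Header-side helper: the deep copy is made with the client's std::string.
inline std::string DescribeIdCopy( int id )
{
    const MyStringView v = DescribeId( id );
    return std::string( v.data, v.size );
}
```

Shipping a helper like `DescribeIdCopy` inline in the header gives callers a safe default while keeping std::string out of the compiled library.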

For returning a container by value you can use a callback-based API. Your API is going to call the user callback for each element of the container being returned.
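
One possible shape for such an API, sketched with hypothetical names (`ForEachValue`, `GetValues`) and a fixed list of ints standing in for the library's data:

```cpp
#include <cassert>
#include <vector>

// Exported by the library (hypothetical): calls cb once per element.
// Only a function pointer and a void* cross the library boundary.
void ForEachValue( void (*cb)( int value, void* user ), void* user );

// Header-only convenience wrapper: the std::vector is built entirely in
// client code, with the client's own standard library.
inline std::vector<int> GetValues()
{
    std::vector<int> out;
    ForEachValue(
        []( int value, void* user ) {
            static_cast<std::vector<int>*>( user )->push_back( value );
        },
        &out );
    return out;
}

// Library side: walks its internal storage and reports each element.
void ForEachValue( void (*cb)( int value, void* user ), void* user )
{
    static const int values[] = { 1, 2, 3 };
    for ( int v : values )
        cb( v, user );
}
```

The caller still gets a by-value container, but the library never constructs or destroys it.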

std::string_view, gsl::span and any_iterator or their equivalents must be implemented in header files that are shipped with your library to its users.

Virtual functions

You can use the Pimpl idiom instead of classes with virtual functions in the API of your library.
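
A bare-bones sketch of the idiom; the `SetValue`/`Value` members and the contents of `Impl` are hypothetical, and the two halves would normally sit in the public header and the library source.

```cpp
#include <cassert>

// --- Public header: no virtual functions, one opaque pointer as the only state ---
class Bar
{
public:
    Bar();
    ~Bar();
    Bar( const Bar& ) = delete;            // copying omitted to keep the sketch short
    Bar& operator=( const Bar& ) = delete;

    void SetValue( int v );
    int Value() const;

private:
    struct Impl; // defined only inside the library
    Impl* impl_;
};

// --- Library source: Impl can grow or change without altering Bar's layout ---
struct Bar::Impl
{
    int value = 0;
};

Bar::Bar() : impl_( new Impl ) {}
Bar::~Bar() { delete impl_; }

void Bar::SetValue( int v )
{
    assert( impl_ != nullptr );
    impl_->value = v;
}

int Bar::Value() const { return impl_->value; }
```

Since `Impl` is allocated and destroyed inside the library, clients only ever see a class whose size is one pointer.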

Serialization

Serialization can be implemented as functions in header files that use the public API of your library and serialize/deserialize using IOStreams.

Maxim Egorushkin