
I've distilled my problem down to a (hopefully) very simple example. At a high level, I have a shared library which provides a class implementation, and a main executable which uses the library. In my example, the library is then rebuilt with -DMORE so that the class gains one additional private member, which is initialized in the constructor's initializer list. Since, as far as I can tell, the ABI signature of the library does not change, there should be no need to recompile the executable. Yet, in my case, I get a core dump. I do not understand why this is an issue. Can someone please point out where I am going wrong?

Environment

Linux, amd64, gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

Setup

Using the code provided below, do the following:

  1. make clean
  2. make main (which also builds the base_orig version of the library)
  3. ./main which runs just fine
  4. make base_more
  5. ./main which crashes with
    Base hello
    Base class constructor has non-null MORE
    Base goodbye
    Base class destructor has non-null MORE
    *** stack smashing detected ***: terminated
    Aborted (core dumped)
    

Code

library header (base.h)

#ifdef MORE
    #include <functional>
#endif

class base
{
  public:
    base();
    ~base();

  private:

#ifdef MORE
    std::function<void()> more_;
#endif
};

library source (base.cpp)

#include "base.h"
#include <iostream>

#ifdef MORE
void hi()
{
  std::cout << "Hello from MORE" << std::endl;
}
#endif

base::base()
#ifdef MORE
   : more_(std::bind(&hi))
#endif

{
  std::cout << "Base hello " << std::endl;

#ifdef MORE
  if (nullptr != more_)
  {
    std::cout << "Base class constructor has non-null MORE" << std::endl;
  }
#endif
}

base::~base()
{
  std::cout << "Base goodbye " << std::endl;

#ifdef MORE
  if (nullptr != more_)
  {
    std::cout << "Base class destructor has non-null MORE" << std::endl;
  }
#endif
}

Executable (main.cpp)

#include "base.h"

int main()
{
  base x;
}

Makefile

base_orig:
        g++ -O0 -g -fPIC -shared -olibbase.so base.cpp
        objdump -C -S -d libbase.so > orig.objdump

base_more:
        g++ -O0 -g -DMORE -fPIC -shared -olibbase.so base.cpp
        objdump -C -S -d libbase.so > more.objdump

main: base_orig
        g++ -O0 -g -Wextra -Werror main.cpp -o main -L. -Wl,-rpath=. -lbase
        objdump -C -S -d main > main.objdump

clean:
        rm -f main libbase.so

I tried to go through the objdump output to figure out why the stack is getting corrupted, but alas, my knowledge of amd64 assembly is rather weak.

Paul Grinberg
  • 3
    The program breaks the [One Definition Rule](https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule) _"...One and only one definition of every non-inline function or variable that is odr-used (see below) is required to appear in the entire program (including any standard and user-defined libraries). The compiler is not required to diagnose this violation, but the behavior of the program that violates it is undefined...."_ – Richard Critten Mar 10 '23 at 21:42
  • 1
    Specifically your conditional defining of `MORE` is causing `base` to have multiple different definitions – Drew Dormann Mar 10 '23 at 21:54
  • @RichardCritten - I do not understand how a private class member change impacts the class, with regards to ODR. There is no public-facing change. How does the main executable even know that there is a difference? – Paul Grinberg Mar 10 '23 at 21:58
  • @DrewDormann - same comment as above. Can you please clarify how the main executable knows anything about private members of a class? According to `objdump` output of `main`, there doesn't appear to be any linked content – Paul Grinberg Mar 10 '23 at 22:06
  • 4
    @PaulGrinberg in general, C++ depends on an object having one definition. More specifically, in your case, C++ depends on the compiler knowing the **byte size** of any object. Your two different definitions of `base` produce two contradictory values for `sizeof(base)`. – Drew Dormann Mar 10 '23 at 22:09
  • Further, should destroying a `base` destroy a `std::function`? Your program has no idea. It might try, it might not. It has conflicting information on what it should do. – Drew Dormann Mar 10 '23 at 22:16
  • @DrewDormann - I can see how the *byte size* of the class on the heap/stack would change. But what I am not understanding is why that heap/stack allocation is happening in the `main` as opposed to the library? In other words, what I think I am missing is exactly where in the process of constructing a new object is the memory allocated? – Paul Grinberg Mar 10 '23 at 22:18
  • 1
    It's [Undefined Behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior). And you are asking how exactly this Undefined Behavior is behaving for you. – Drew Dormann Mar 10 '23 at 22:20
  • 1
    C++ has so-called [as-if rule](https://en.cppreference.com/w/cpp/language/as_if), which means that as long as the observable behavior of a (correct) program is the same, the compiler is allowed to do (or not do) anything. Memory allocation, construction of an object and its sub-objects, and so on, can be done in any bizarre way deemed optimal by the compiler, as long as it doesn't change the observable behavior. But here's the catch: if the program has UB (such as when object's layout turns out to be different when it is expected to be the same), the as-if tricks may fail in spectacular ways. – heap underrun Mar 10 '23 at 22:35

2 Answers

3

You're trying to fit a probably 24 or 32 byte std::function member into a 1-byte empty class. There simply isn't enough space to hold it.

When you say base x; in main, main does two things:

  1. Reserve enough memory to hold a base object
  2. Pass a pointer to that memory to base's constructor

Since MORE wasn't defined when you compiled main, as far as it is concerned, base has no data members. Therefore it will only reserve 1 byte of memory (since every object needs a unique address, even if it's empty). It then passes a pointer to that 1 byte of memory to base's constructor, which is located in your dynamically loaded library. Since MORE was defined when that library was compiled, it thinks a base object has one std::function member and will try to initialize that member in the memory that main passed it a pointer to. There isn't enough space there, and so it ends up initializing more_ in memory that was in use by something else.
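To make the size mismatch concrete, here is a minimal sketch that puts the two views of the class side by side in one program. The names base_without_more, base_with_more, size_seen_by_main, size_seen_by_library and bytes_overrun are hypothetical stand-ins invented for this illustration; they are not part of the question's code. The exact size of std::function depends on the standard library, so the sketch only relies on it being larger than an empty class.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for `base` as main.cpp saw it (compiled without -DMORE).
struct base_without_more {};

// Hypothetical stand-in for `base` as base.cpp saw it (compiled with -DMORE).
struct base_with_more {
    std::function<void()> more_;
};

// Number of bytes the caller reserves for the object.
std::size_t size_seen_by_main() { return sizeof(base_without_more); }

// Number of bytes the library's constructor believes it may write to.
std::size_t size_seen_by_library() { return sizeof(base_with_more); }

// Number of bytes the constructor would write past the caller's allocation.
std::size_t bytes_overrun() {
    assert(size_seen_by_library() >= size_seen_by_main());
    return size_seen_by_library() - size_seen_by_main();
}
```

Printing the three values from a small main shows how far the library's constructor would reach beyond what the executable reserved.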

Remember, a pointer contains no information about how much memory is available where it points, so base's constructor must assume that it was passed a pointer to enough memory to hold a base object. That means that main and base's constructor need to agree on how big a base object is.
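The "caller reserves, callee constructs" split can be imitated in a single program with placement new. This is only an analogy for what the compiler generates for base x;, not a reproduction of it. The type widget and the functions construct_at_address and demo_correct_storage are hypothetical names made up for the sketch. The buffer here is deliberately sized correctly; handing construct_at_address a smaller buffer would be the same undefined behavior as in the question.

```cpp
#include <cassert>
#include <functional>
#include <new>

// Hypothetical class with the same kind of member as `base` under -DMORE.
struct widget {
    std::function<void()> more_;
};

// Plays the role of the library's constructor: it receives only an address
// and has to trust that sizeof(widget) suitably aligned bytes live there.
widget* construct_at_address(void* storage) {
    assert(storage != nullptr);
    return new (storage) widget{};
}

// Plays the role of main: it decides how much memory to reserve.
bool demo_correct_storage() {
    // Both sides agree on sizeof(widget), so the construction is safe.
    alignas(widget) unsigned char storage[sizeof(widget)];
    widget* w = construct_at_address(storage);
    bool is_empty = !w->more_;  // a default-constructed std::function is empty
    w->~widget();               // placement new requires a manual destructor call
    return is_empty;
}
```

Nothing in the signature of construct_at_address lets it check the size of the buffer, which is why caller and callee have to be compiled against the same class definition.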


The way to avoid this issue is to avoid passing actual objects across library boundaries and only ever pass pointers.

That is, you can make base's constructor private and add a static member function std::unique_ptr<base> make_base(). That way, allocating memory for base objects becomes the sole responsibility of the library, and the main program and the library can never disagree on how much memory is needed to hold a base. This does, of course, come with some overhead, since it requires that all base objects be dynamically allocated. It's also important to compile the main program and the library with the same compiler and C++ standard library, so that they agree on the size of any standard library types that you do pass across the library boundary (such as std::unique_ptr or std::string).
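A rough single-file sketch of that factory approach follows, with comments marking which parts would live in base.h and which in base.cpp. Several pieces are assumptions added for the illustration and are not in the original code: the base_deleter helper and base_ptr alias (used so that deallocation, like allocation, is compiled into the library), the has_more accessor, and demo_factory. The more_ member is shown unconditionally for brevity.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>

// ---- would live in base.h ----
class base;

// Hypothetical deleter: its body is compiled into the library, so the
// object is freed by the same code that knows its real size.
struct base_deleter {
    void operator()(base* p) const;
};
using base_ptr = std::unique_ptr<base, base_deleter>;

class base {
public:
    static base_ptr make_base();  // the only way for callers to obtain a base
    bool has_more() const;        // hypothetical accessor, for the demo only
    ~base();

private:
    base();                       // private: callers cannot write `base x;`
    std::function<void()> more_;
};

// ---- would live in base.cpp ----
static void hi() { std::cout << "Hello from MORE" << std::endl; }

base::base() : more_(&hi) {}
base::~base() = default;
bool base::has_more() const { return static_cast<bool>(more_); }
base_ptr base::make_base() { return base_ptr(new base()); }
void base_deleter::operator()(base* p) const { delete p; }

// ---- would live in main.cpp ----
bool demo_factory() {
    base_ptr x = base::make_base();  // memory is reserved inside the library
    assert(x != nullptr);
    return x->has_more();
}
```

With this shape the executable only ever stores a pointer, so its idea of sizeof(base) is never used to reserve storage for the object.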

Miles Budnek
2

The base.h header as seen by main.cpp describes a different class (with a different size) than the same header as seen by base.cpp, which is compiled with MORE defined and therefore also declares the more_ member.

tomaszmi
  • I apparently am missing something fundamental about C++. How does a private member have any impact on the public API of the class? When compiled into the executable, there should be no linker reference to any private members in the class. I believe that is what I see in `objdump` output of `main`. Where is the fallacy in my thinking? – Paul Grinberg Mar 10 '23 at 22:05
  • When main allocates memory for `x`, it only allocates enough stack memory for an object with no members (typically 1 byte), but the constructor of `base` expects there to be space for the members, so it writes outside of the space allocated in main, hence your stack smashing error. Read about the "one definition rule"; your code violates it and therefore has undefined behaviour – Alan Birtles Mar 10 '23 at 22:15
  • 2
    "*How does a private member have any impact on the public API of the class?*" - it doesn't. But it *does* affect the byte size and member layout in memory of any instances of that class. – Remy Lebeau Mar 10 '23 at 22:18