Why is it a mistake to 'allocate in one library and free in the other'?

Question

Google's Android NDK guides say:

Memory allocated in one library, and freed in the other, causing memory leakage or heap corruption.

Why?
Is this always true?

EDIT

As @Galik pointed out, the context of this quote is:

In C++, it is not safe to define more than one copy of the same function or object in a single program. This is one aspect of the One Definition Rule present in the C++ standard.

When using a static runtime (and static libraries in general), it is easy to accidentally break this rule. For example, the following application breaks this rule:

...

In this situation, the STL, including and global data and static constructors, will be present in both libraries. The runtime behavior of this application is undefined, and in practice crashes are very common. Other possible issues include:

Memory allocated in one library, and freed in the other, causing memory leakage or heap corruption.

Exceptions raised in libfoo.so going uncaught in libbar.so, causing your app to crash.

Buffering of std::cout not working properly.

@AlexF - two questions: 1. always? 2. why? [+can you provide a quote from external doc] — y30, Jul 08 '18 at 14:57
This is basically a problem of mismatching allocation / deallocation calls. — user7860670, Jul 08 '18 at 15:06
According to the documentation you linked, that situation seems to only apply when you break the One Definition Rule by using the static c++ runtime in two different shared libraries. You can only use the static runtime to build a one library application. — Galik, Jul 08 '18 at 15:16
@Galik - OK, I think you're right. "In this situation, the STL, including and global data and static constructors, will be present in both libraries. The runtime behavior of this application is undefined, and in practice crashes are very common. Other possible issues include:..." - I had understood it differently. Thanks — y30, Jul 08 '18 at 15:25
@y30 I don't know definitively, but with static linking you get a copy of the code in everything you link it into. So, I would imagine, if you have two different sets of functions to allocate and delete memory, they will likely each set up their own internal book-keeping data telling them how to free what they allocated. I would imagine they each have completely independent pools of memory provided by the OS. — Galik, Jul 08 '18 at 16:03
My opinion is that such a question would better go into [Software Engineering](https://softwareengineering.stackexchange.com/) — Basile Starynkevitch, Jul 08 '18 at 16:17

Ivan Rubinson · Answer 1 · 2018-07-08T15:54:57.050

One possible reason it's considered a mistake is that allocation usually comes with some initialization logic, and deallocation with some destruction logic.

Theory:

The main danger is mismatching initialization / destruction logic.

Let's look at two different STL versions as two different and separate libraries.

Consider this: Each library lets you allocate / deallocate something. Upon resource acquisition, each library does some house-keeping on that thing in its own way, which is encapsulated (read: you don't know about it, and don't need to). What happens if the housekeeping each does is significantly different?

Example:

#include <algorithm> // std::find
#include <vector>

class Foo
{
private:
    int x;

public:
    Foo() : x(42) {}
};


namespace ModuleA
{
    Foo* createAFoo()
    {
        return new Foo();
    }

    void deleteAFoo(Foo* foo)
    {
        if(foo != nullptr)
            delete foo;
    }
}

namespace ModuleB
{
    std::vector<Foo*> all_foos;

    Foo* createAFoo()
    {
        Foo* foo = new Foo();
        all_foos.push_back(foo);
        return foo;
    }

    void deleteAFoo(Foo* foo)
    {
        if(foo != nullptr)
        {
            std::vector<Foo*>::iterator position = std::find(all_foos.begin(), all_foos.end(), foo);
            if (position != all_foos.end())
            {
                all_foos.erase(position);
            }
            delete foo;
        }
    }
}

Question: What happens if we do the following?

Foo* foo = ModuleB::createAFoo();
ModuleA::deleteAFoo(foo);

Answer: ModuleB now has a dangling pointer in all_foos. This can cause all sorts of scary, hard-to-debug issues down the line. We're also not making all_foos smaller, which may be considered a memory leak (the size of a pointer each time).

Question: What happens if we do the following?

Foo* foo = ModuleA::createAFoo();
ModuleB::deleteAFoo(foo);

Answer: Looks like... nothing bad happens! But what if I removed the check that position is not the end iterator? Then erase would be called with end(), which is undefined behavior. And an STL might skip such a check in the name of optimization, so...
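
One way to avoid the mismatch is to make the pointer carry its own deleter, so that destruction always goes back to the module that created the object. The following is a minimal sketch of that idea, not part of the original answer: FooDeleter, FooPtr, makeFoo and liveFooCount are hypothetical names, and it assumes deleteAFoo is compiled into ModuleB's own library rather than defined inline in a header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Foo
{
private:
    int x;

public:
    Foo() : x(42) {}
    int value() const { return x; }
};

namespace ModuleB
{
    // ModuleB's private bookkeeping, as in the example above.
    std::vector<Foo*> all_foos;

    void deleteAFoo(Foo* foo)
    {
        if (foo == nullptr)
            return;
        std::vector<Foo*>::iterator position = std::find(all_foos.begin(), all_foos.end(), foo);
        if (position != all_foos.end())
        {
            all_foos.erase(position);
        }
        delete foo;
    }

    // Hypothetical addition: a deleter that always routes back to ModuleB.
    struct FooDeleter
    {
        void operator()(Foo* foo) const { deleteAFoo(foo); }
    };

    using FooPtr = std::unique_ptr<Foo, FooDeleter>;

    // Hypothetical factory: the returned pointer can only be released
    // through ModuleB::deleteAFoo, so the bookkeeping stays consistent.
    FooPtr makeFoo()
    {
        FooPtr foo(new Foo());
        all_foos.push_back(foo.get());
        return foo;
    }

    std::size_t liveFooCount() { return all_foos.size(); }
}
```

With this shape, a caller that holds a ModuleB::FooPtr cannot accidentally hand the object to ModuleA::deleteAFoo without first calling release() on purpose.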

Good point! But I think you could give a better example. You don't need 'ModuleA', only 'ModuleB' twice (from two different libs). [This is exactly the same as the case of two static STLs] — y30, Jul 09 '18 at 05:31
"And an STL might do that in the name of optimization, so..." - can someone give a real STL example of this? — y30, Jul 09 '18 at 05:32
@y30 [STL Vectors do not check bounds](https://stackoverflow.com/questions/14015632/c-vector-bounds). — Ivan Rubinson, Jul 10 '18 at 11:17

Dan Albert · Answer 2 · 2018-07-12T22:48:53.227

I wrote that section of the doc. I've had to debug an issue where one of the standard stream objects (cout or similar) was linked into two libraries, resulting in two distinct instances of the object. The constructor for the object was run twice, but both times on the same instance of the object. One object was double initialized, the other was uninitialized. When the unconstructed object was used, it would attempt to access some uninitialized memory and crash.

There's really no limit to the strangeness of undefined behavior. It's entirely possible that the bug I'm remembering was unique to the version of the compiler, linker, or loader that we were using at the time.

EDIT: Here's a repro case:

// foo.cpp
#include <stdio.h>

class Foo {
 public:
  Foo() { printf("this: %p\n", this); }
};

Foo foo;

// main.cpp
int main() {
}

Build with:

$ clang++ --version
clang version 7.0.0 (trunk 330210)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
$ clang++ foo.cpp -shared -o libfoo.so
$ clang++ foo.cpp -shared -o libbar.so
$ clang++ main.cpp -L. -lfoo -lbar -rpath  '$ORIGIN'

Both libfoo and libbar will be loaded, and each has its own copy of the object. The constructor will be run twice, but as the output shows, only one instance of the object has its constructor run; it just runs twice on that instance.

$ ./a.out
this: 0x7f9475d48031
this: 0x7f9475d48031
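
The answer does not say why both constructor calls land on the same address. One plausible explanation, stated here as an assumption, is that each library runs its own static initializer, but the loader binds the name foo to the first definition it finds for both of them. The single-file sketch below only simulates that outcome with placement new; libfooStorage, libbarStorage, resolveFooSymbol and the other helper names are hypothetical, and no real dynamic linking is involved.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <new>

// Records how many times a constructor ran at each address.
std::map<const void*, int>& constructionCounts() {
  static std::map<const void*, int> counts;
  return counts;
}

class Foo {
 public:
  Foo() {
    ++constructionCounts()[this];
    std::printf("this: %p\n", static_cast<void*>(this));
  }
};

// Stand-ins for the storage each library reserves for its own copy of `foo`.
alignas(Foo) unsigned char libfooStorage[sizeof(Foo)];
alignas(Foo) unsigned char libbarStorage[sizeof(Foo)];

// Assumption: both libraries resolve the name `foo` to the same definition.
void* resolveFooSymbol() { return libfooStorage; }

// Each library still runs its own initializer for `foo`.
void runLibfooInitializer() { new (resolveFooSymbol()) Foo(); }
void runLibbarInitializer() { new (resolveFooSymbol()) Foo(); }

// Returns how many times a Foo was constructed at the given address.
int timesConstructed(const void* address) {
  std::map<const void*, int>::const_iterator it = constructionCounts().find(address);
  return it == constructionCounts().end() ? 0 : it->second;
}
```

Running both initializers prints the same address twice; timesConstructed then reports two constructions for libfooStorage and none for libbarStorage, which mirrors the "one double initialized, one uninitialized" situation described above.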

It would help a lot if you could give a small code example. I want to analyze it to understand the exact cause. Thanks — y30, Jul 10 '18 at 04:52
I had actually spent a while trying to recreate the test case, but was having trouble remembering enough of the details. — Dan Albert, Jul 10 '18 at 19:49
Managed to work out an example of the bad behavior. Updated the answer. — Dan Albert, Jul 12 '18 at 22:49
