3

I want to Compiling / linking the whole stdlibc+ into a so to independence with C++ support. So I make an empty so file with only contains declare extern "C" void * __dso_handle = 0 and export all symbols link it with

-Wl,--whole-archive
-Wl,-Bstatic
stdc++

And set the link flags -nostdlib everything works as expected except I can't unmap it from the memory by call dlclose() even I immediately call dlclose() after dlopen(). It still existed when I use command cat /proc/pid/maps.

Then I use LD_DEBUG=binding to load the dso file. I found it have a lot of symbols binding to libc.so.6, libstdc++.so.6. I guess when system load the dso file into memory than it change the default function pointer point into the dso function that static linked stdc+ into the dso file.

My question is: Is it possible to unmap the dso file from memory by dlclose()? Maybe system not support rebind the function pointer to the default address so it can't support unload the library from memory by dlclose(). Am I right? Or is there any way I can do?

I also tried without -nostdlib and remove the code extern "C" void * __dso_handle = 0 so the so is totally empty but still can't unload it from memory by use dlclose. I have no idea now.

Update Apr 25, 2018

Right now use gold and -Wl, – no-GNU-unique can success unload the library from memory but it will cause the application crash when exit after call dlclose. The stack is below:

#0 0x00007ffff0245bf0 in ?? () 
#1 0x00007ffff70e8a69 in __run_exit_handlers () from /lib64/libc.so.6 
#2 0x00007ffff70e8ab5 in exit () from /lib64/libc.so.6 
#3 0x00007ffff70d1c0c in __libc_start_main () from /lib64/libc.so.6 
#4 0x000000000047a1e5 in _start ()

Don't know why. I am going to rebuild GCC with – disable-GNU-unique-object to try again. And I made a sample repo to reproduce this issue here.

The crash also exist with the GCC rebuilded with – disable-GNU-unique-object, even don't use gold and -Wl, – no-GNU-unique to compile and link it.

Update Apr 26, 2018 here. And if you have interesting please see my repo here that contains the source of crashed and fixed dso project that you can reproduce and compare the changes and the fixed details about the compiler and the linker options used. Will keep update if any new issue happens. At last Thank you very much @yugr:).

#ifndef _ICXXABI_H
    #define _ICXXABI_H

    #define ATEXIT_MAX_FUNCS    128

    #ifdef __cplusplus
    extern "C" {
    #endif

typedef unsigned uarch_t;

struct atexit_func_entry_t
{
    /*
    * Each member is at least 4 bytes large. Such that each entry is 12bytes.
    * 128 * 12 = 1.5KB exact.
    **/
    void (*destructor_func)(void *);
    void *obj_ptr;
    void *dso_handle;
};

atexit_func_entry_t __atexit_funcs[ATEXIT_MAX_FUNCS];
uarch_t __atexit_func_count = 0;

void *__dso_handle = 0; //Attention! Optimally, you should remove the '= 0' part and define this in your asm script.

int __cxa_atexit(void (*f)(void *), void *objptr, void *dso)
{
    if (__atexit_func_count >= ATEXIT_MAX_FUNCS) {return -1;};
    __atexit_funcs[__atexit_func_count].destructor_func = f;
    __atexit_funcs[__atexit_func_count].obj_ptr = objptr;
    __atexit_funcs[__atexit_func_count].dso_handle = dso;
    __atexit_func_count++;
    return 0; /*I would prefer if functions returned 1 on success, but the ABI says...*/
};

void __cxa_finalize(void *f)
{
    uarch_t i = __atexit_func_count;
    if (!f)
    {
        /*
        * According to the Itanium C++ ABI, if __cxa_finalize is called without a
        * function ptr, then it means that we should destroy EVERYTHING MUAHAHAHA!!
        *
        * TODO:
        * Note well, however, that deleting a function from here that contains a __dso_handle
        * means that one link to a shared object file has been terminated. In other words,
        * We should monitor this list (optional, of course), since it tells us how many links to 
        * an object file exist at runtime in a particular application. This can be used to tell 
        * when a shared object is no longer in use. It is one of many methods, however.
        **/
        //You may insert a prinf() here to tell you whether or not the function gets called. Testing
        //is CRITICAL!
        while (i--)
        {
            if (__atexit_funcs[i].destructor_func)
            {
                /* ^^^ That if statement is a safeguard...
                * To make sure we don't call any entries that have already been called and unset at runtime.
                * Those will contain a value of 0, and calling a function with value 0
                * will cause undefined behaviour. Remember that linear address 0, 
                * in a non-virtual address space (physical) contains the IVT and BDA.
                *
                * In a virtual environment, the kernel will receive a page fault, and then probably
                * map in some trash, or a blank page, or something stupid like that.
                * This will result in the processor executing trash, and...we don't want that.
                **/
                (*__atexit_funcs[i].destructor_func)(__atexit_funcs[i].obj_ptr);
            };
        };
        return;
    };

    while (i--)
    {
        /*
        * The ABI states that multiple calls to the __cxa_finalize(destructor_func_ptr) function
        * should not destroy objects multiple times. Only one call is needed to eliminate multiple
        * entries with the same address.
        *
        * FIXME:
        * This presents the obvious problem: all destructors must be stored in the order they
        * were placed in the list. I.e: the last initialized object's destructor must be first
        * in the list of destructors to be called. But removing a destructor from the list at runtime
        * creates holes in the table with unfilled entries.
        * Remember that the insertion algorithm in __cxa_atexit simply inserts the next destructor
        * at the end of the table. So, we have holes with our current algorithm
        * This function should be modified to move all the destructors above the one currently
        * being called and removed one place down in the list, so as to cover up the hole.
        * Otherwise, whenever a destructor is called and removed, an entire space in the table is wasted.
        **/
        if (__atexit_funcs[i].destructor_func == f)
        {
            /* 
            * Note that in the next line, not every destructor function is a class destructor.
            * It is perfectly legal to register a non class destructor function as a simple cleanup
            * function to be called on program termination, in which case, it would not NEED an
            * object This pointer. A smart programmer may even take advantage of this and register
            * a C function in the table with the address of some structure containing data about
            * what to clean up on exit.
            * In the case of a function that takes no arguments, it will simply be ignore within the
            * function itself. No worries.
            **/
            (*__atexit_funcs[i].destructor_func)(__atexit_funcs[i].obj_ptr);
            __atexit_funcs[i].destructor_func = 0;

            /*
            * Notice that we didn't decrement __atexit_func_count: this is because this algorithm
            * requires patching to deal with the FIXME outlined above.
            **/
        };
    };
};

    #ifdef __cplusplus
    };
    #endif

#endif

Add a new implement extract from Chrome seems thread safe put it here for backup solution. Not sure which one is the best.

/*  $OpenBSD: atexit.c,v 1.14 2007/09/05 20:47:47 chl Exp $ */
/*
 * Copyright (c) 2002 Daniel Hartmeier
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 *    - Redistributions of source code must retain the above copyright
 *      notice, this list of conditions and the following disclaimer.
 *    - Redistributions in binary form must reproduce the above
 *      copyright notice, this list of conditions and the following
 *      disclaimer in the documentation and/or other materials provided
 *      with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 *
 */
#include <sys/cdefs.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>

extern "C"
{

static pthread_mutex_t   gAtExitLock = PTHREAD_MUTEX_INITIALIZER;

void  _thread_atexit_lock( void )
{
  pthread_mutex_lock( &gAtExitLock );
}

void _thread_atexit_unlock( void )
{
  pthread_mutex_unlock( &gAtExitLock );
}

#define _ATEXIT_LOCK() _thread_atexit_lock()
#define _ATEXIT_UNLOCK() _thread_atexit_unlock()

struct atexit {
    struct atexit *next;        /* next in list */
    int ind;            /* next index in this table */
    int max;            /* max entries >= ATEXIT_SIZE */
    struct atexit_fn {
        void (*cxa_func)(void *);
        void *fn_arg;       /* argument for CXA callback */
        void *fn_dso;       /* shared module handle */
    } fns[1];           /* the table itself */
};
void *__dso_handle = 0;
struct atexit *__atexit;
/*
 * TODO: Read this before upstreaming:
 *
 * As of Apr 2014 there is a bug regaring function type detection logic in
 * Free/Open/NetBSD implementations of __cxa_finalize().
 *
 * What it is about:
 * First of all there are two kind of atexit handlers:
 *  1) void handler(void) - this is the regular type
 *     available for to user via atexit(.) function call.
 *
 *  2) void internal_handler(void*) - this is the type
 *     __cxa_atexit() function expects. This handler is used
 *     by C++ compiler to register static destructor calls.
 *     Note that calling this function as the handler of type (1)
 *     results in incorrect this pointer in static d-tors.
 *
 * What is wrong with BSD implementations:
 *
 *  They use dso argument to identify the handler type. The problem
 *  with it is dso is also used to identify the handlers associated
 *  with particular dynamic library and allow __cxa_finalize to call correct
 *  set of functions on dlclose(). And it cannot identify both.
 *
 * What is correct way to identify function type?
 *
 *  Consider this:
 *  1. __cxa_finalize and __cxa_atexit are part of libc and do not have access to hidden
 *     &__dso_handle.
 *  2. __cxa_atexit has only 3 arguments: function pointer, function argument, dso.
 *     none of them can be reliably used to pass information about handler type.
 *  3. following http://www.codesourcery.com/cxx-abi/abi.html#dso-dtor (3.3.5.3 - B)
 *     translation of user atexit -> __cxa_atexit(f, NULL, NULL) results in crashes
 *     on exit() after dlclose() of a library with an atexit() call.
 *
 *  One way to resolve this is to always call second form of handler, which will
 *  result in storing unused argument in register/stack depending on architecture
 *  and should not present any problems.
 *
 *  Another way is to make them dso-local in one way or the other.
 */
/*
 * Function pointers are stored in a linked list of pages. The list
 * is initially empty, and pages are allocated on demand. The first
 * function pointer in the first allocated page (the last one in
 * the linked list) was reserved for the cleanup function.
 *
 * Outside the following functions, all pages are mprotect()'ed
 * to prevent unintentional/malicious corruption.
 */
/*
 * Register a function to be performed at exit or when a shared object
 * with the given dso handle is unloaded dynamically.  Also used as
 * the backend for atexit().  For more info on this API, see:
 *
 *  http://www.codesourcery.com/cxx-abi/abi.html#dso-dtor
 */
int
__cxa_atexit(void (*func)(void *), void *arg, void *dso)
{
    struct atexit *p = __atexit;
    struct atexit::atexit_fn *fnp;
    size_t pgsize = getpagesize();
    int ret = -1;
    if (pgsize < sizeof(*p))
        return (-1);
    _ATEXIT_LOCK();
    p = __atexit;
    if (p != NULL) {
        if (p->ind + 1 >= p->max)
            p = NULL;
        else if (mprotect(p, pgsize, PROT_READ | PROT_WRITE))
            goto unlock;
    }
    if (p == NULL) {
        p = (struct atexit *)mmap(NULL, pgsize, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED)
            goto unlock;
        if (__atexit == NULL) {
            memset(&p->fns[0], 0, sizeof(p->fns[0]));
            p->ind = 1;
        } else
            p->ind = 0;
        p->max = (pgsize - ((char *)&p->fns[0] - (char *)p)) /
            sizeof(p->fns[0]);
        p->next = __atexit;
        __atexit = p;
    }
    fnp = &p->fns[p->ind++];
    fnp->cxa_func = func;
    fnp->fn_arg = arg;
    fnp->fn_dso = dso;
    if (mprotect(p, pgsize, PROT_READ))
        goto unlock;
    ret = 0;
unlock:
    _ATEXIT_UNLOCK();
    return (ret);
}
/*
 * Call all handlers registered with __cxa_atexit() for the shared
 * object owning 'dso'.
 * Note: if 'dso' is NULL, then all remaining handlers are called.
 */
void
__cxa_finalize(void *dso)
{
    struct atexit *p, *q, *original_atexit;
    struct atexit::atexit_fn fn;
    int n, pgsize = getpagesize(), original_ind;
    static int call_depth;
    _ATEXIT_LOCK();
    call_depth++;
    p = original_atexit = __atexit;
    n = original_ind = p != NULL ? p->ind : 0;
    while (p != NULL) {
        if (p->fns[n].cxa_func != NULL /* not called */
                && (dso == NULL || dso == p->fns[n].fn_dso)) { /* correct DSO */
            /*
             * Mark handler as having been already called to avoid
             * dupes and loops, then call the appropriate function.
             */
            fn = p->fns[n];
            if (mprotect(p, pgsize, PROT_READ | PROT_WRITE) == 0) {
                p->fns[n].cxa_func = NULL;
                mprotect(p, pgsize, PROT_READ);
            }
            _ATEXIT_UNLOCK();
            (*fn.cxa_func)(fn.fn_arg);
            _ATEXIT_LOCK();
            // check for new atexit handlers
            if ((__atexit->ind != original_ind) || (__atexit != original_atexit)) {
                // need to restart now to preserve correct
                // call order - LIFO
                p = original_atexit = __atexit;
                n = original_ind = p->ind;
                continue;
            }
        }
        if (n == 0) {
            p = p->next;
            n = p != NULL ? p->ind : 0;
        } else {
            --n;
        }
    }
    --call_depth;
    /*
     * If called via exit(), unmap the pages since we have now run
     * all the handlers.  We defer this until calldepth == 0 so that
     * we don't unmap things prematurely if called recursively.
     */
    if (dso == NULL && call_depth == 0) {
        for (p = __atexit; p != NULL; ) {
            q = p;
            p = p->next;
            munmap(q, pgsize);
        }
        __atexit = NULL;
    }
    _ATEXIT_UNLOCK();
}
/*
 * Register the cleanup function
 */
void
__atexit_register_cleanup(void (*func)(void))
{
        struct atexit *p;
        size_t pgsize = getpagesize();
        if (pgsize < sizeof(*p))
                return;
        _ATEXIT_LOCK();
        p = __atexit;
        while (p != NULL && p->next != NULL)
                p = p->next;
        if (p == NULL) {
                p = (struct atexit *)mmap(NULL, pgsize, PROT_READ | PROT_WRITE,
                    MAP_ANON | MAP_PRIVATE, -1, 0);
                if (p == MAP_FAILED)
                        goto unlock;
                p->ind = 1;
                p->max = (pgsize - ((char *)&p->fns[0] - (char *)p)) /
                    sizeof(p->fns[0]);
                p->next = NULL;
                __atexit = p;
        } else {
                if (mprotect(p, pgsize, PROT_READ | PROT_WRITE))
                        goto unlock;
        }
        p->fns[0].cxa_func = (void (*)(void*))func;
        p->fns[0].fn_arg = NULL;
        p->fns[0].fn_dso = NULL;
        mprotect(p, pgsize, PROT_READ);
unlock:
        _ATEXIT_UNLOCK();
}

}

The so that use gold link can't wrap any C++ function e.g. memcpy into it like below and link with --wrap=memcpy

#include <string.h>

/* some systems do not have newest memcpy@@GLIBC_2.14 - stay with old good one */
asm (".symver memcpy, memcpy@GLIBC_2.2.5");

void *__wrap_memcpy(void *dest, const void *src, size_t n)
{
    return memcpy(dest, src, n);
}

This works with ld-bfd, but not with gold. The reason is that gold wraps the symbols for all versions, so the call to memcpy becomes a call to __wrap_memcpy which results in an infinite recursion. For more details please see here. But on my side it not work even I use __wrap_memcpy and __real_memcpy it still results in an infinite secursion. Finally fixed it by implement __wrap_memcpy by memmove.

#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

    void *__wrap_memcpy(void *dest, const void *src, size_t n)
    {
        return memmove(dest, src, n);
    }

#ifdef __cplusplus
}
#endif

So it contains wrapped memcpy and independent with stdc+ library.

Memory leak issue in static link stdc+ library more detail is here.

double-beep
  • 5,031
  • 17
  • 33
  • 41
TQ.Liu
  • 81
  • 6

1 Answers1

0

Most probly libstdc++ contains some STB_GNU_UNIQUE symbol which, if used, prevents containing library from being unloaded (see e.g. here or here for details and examples).

There are several ways to resolve this depending on your use-case e.g.

  • compile with -fvisibility=hidden to hide all libstdc++ symbols and then manually annotate symbols which must be exported from your library with __attribute__((visibility("default"))) (you can also use linker scripts for the same purpose but that does not allow compiler to optimize code as well)
  • use gold and compile with -Wl,-no-gnu-unique
  • build custom GCC with --disable-gnu-unique-object and use that to compiler your library
yugr
  • 19,769
  • 3
  • 51
  • 96
  • Thanks for the response. seems -Wl,-no-gnu-unique can't work for me. So I am try to compile with --version-script to open export needed symbol. to have a try. – TQ.Liu Apr 24 '18 at 16:17
  • @TQ.Liu I suggest to use `-fvisibility=hidden` instead of version script. It's generally easier, more widely used and allows compiler to optimize code better. – yugr Apr 24 '18 at 16:45
  • I tried with --version-script but I can't export symbols that start with `std::`. the script is as below:`{ global: *basic_istream*; *basic_ostream*; std::*; *cxa*; local: *; };` then I tried with `-fvisibility=hidden` but it still export every symbols and can't unload from memory. – TQ.Liu Apr 24 '18 at 17:23
  • @TQ.Liu I suggest to add gcc commands to the question, otherwise it's impossible to diagnose. – yugr Apr 24 '18 at 17:40
  • I used cmake3 to build the project. and the build script is `set(COMPILE_PARAM COMPILE_FLAGS "${ARCH_FLAGS} -fPIC -msse -msse2 -msse3 -Wl,-z,defs") set(LINK_PARAM LINK_FLAGS "${ARCH_FLAGS} ${LINK_FLAGS} -nostdlib -Wl,--no-undefined -Wl,--version-script=${CMAKE_CURRENT_SOURCE_DIR}/test.expmap")` – TQ.Liu Apr 24 '18 at 17:47
  • you are right after use gold and compile with -Wl,-no-gnu-unique than it can unmap from memory by dlclose now. Thank you very much. By the way is other modules that link to this so file also need use gold and compile with -Wl,-no-gnu-unique? – TQ.Liu Apr 25 '18 at 00:20
  • @TQ.Liu No, this should not be necessary. – yugr Apr 25 '18 at 02:32
  • Hi @yugr, Thank you for the reply. Through the dlclose can work by using gold. But the application make crash at exit after dlclose the dso. the stack is :` #0 0x00007ffff0245bf0 in ?? () #1 0x00007ffff70e8a69 in __run_exit_handlers () from /lib64/libc.so.6 #2 0x00007ffff70e8ab5 in exit () from /lib64/libc.so.6 #3 0x00007ffff70d1c0c in __libc_start_main () from /lib64/libc.so.6 #4 0x000000000047a1e5 in _start () ` is there any option I missed when change ld to ld.gold? – TQ.Liu Apr 25 '18 at 04:46
  • @TQ.Liu Sorry, can you post link to repro once again? – yugr Apr 25 '18 at 05:15
  • Hi @yugr. sure the repo is [here](https://github.com/blknit/linux_issue) you need make it with cmake3 after the build it generate a executable file testbin and a libso.so. the testbin just do dlopen the libso.so then dlclose it after dlclose the crash happens. it seems that after the dlclose it make some symbol missing – TQ.Liu Apr 25 '18 at 13:36
  • @TQ.Liu Unfortunately I couldn't repro. I uploaded [some fixes](https://github.com/yugr/linux_issue/commit/691f7f9adc3180a3b18e12b205f8bbfc4a97ec76), maybe they'll be useful. – yugr Apr 26 '18 at 07:16