
We've recently been asked to ship a Linux version of one of our libraries. Previously we've developed under Linux and shipped for Windows, where deploying libraries is generally a lot easier. The problem we've hit is in stripping the exported symbols down to only those in the exposed interface. There are three good reasons for wanting to do this:

  • To protect the proprietary aspects of our technology from exposure through the exported symbols.
  • To prevent users having problems with conflicting symbol names.
  • To speed up the loading of the library (at least so I'm told).

Taking a simple example then:

test.cpp

#include <cmath>

float private_function(float f)
{
    return std::abs(f);
}

extern "C" float public_function(float f)
{
    return private_function(f);
}

compiled with (g++ 4.3.2, ld 2.18.93.20081009)

g++ -shared -o libtest.so test.cpp -s

and inspecting the symbols with

nm -DC libtest.so

gives

         w _Jv_RegisterClasses
0000047c T private_function(float)
000004ba W std::abs(float)
0000200c A __bss_start
         w __cxa_finalize
         w __gmon_start__
0000200c A _edata
00002014 A _end
00000508 T _fini
00000358 T _init
0000049b T public_function

which is obviously inadequate. So next we redeclare the public function as

extern "C" float __attribute__ ((visibility ("default"))) 
    public_function(float f)

and compile with

g++ -shared -o libtest.so test.cpp -s -fvisibility=hidden

which gives

         w _Jv_RegisterClasses
0000047a W std::abs(float)
0000200c A __bss_start
         w __cxa_finalize
         w __gmon_start__
0000200c A _edata
00002014 A _end
000004c8 T _fini
00000320 T _init
0000045b T public_function

which is good, except that std::abs is exposed. More problematic is that when we start linking in other (static) libraries outside of our control, all of the symbols we use from those libraries get exported. In addition, when we start using STL containers:

#include <vector>
struct private_struct
{
    float f;
};

void other_private_function()
{
    std::vector<private_struct> v;
}

we end up with many additional exports from the C++ library

00000b30 W __gnu_cxx::new_allocator<private_struct>::deallocate(private_struct*, unsigned int)
00000abe W __gnu_cxx::new_allocator<private_struct>::new_allocator()
00000a90 W __gnu_cxx::new_allocator<private_struct>::~new_allocator()
00000ac4 W std::allocator<private_struct>::allocator()
00000a96 W std::allocator<private_struct>::~allocator()
00000ad8 W std::_Vector_base<private_struct, std::allocator<private_struct> >::_Vector_impl::_Vector_impl()
00000aaa W std::_Vector_base<private_struct, std::allocator<private_struct> >::_Vector_impl::~_Vector_impl()
00000b44 W std::_Vector_base<private_struct, std::allocator<private_struct> >::_M_deallocate(private_struct*, unsigned int)
00000a68 W std::_Vector_base<private_struct, std::allocator<private_struct> >::_M_get_Tp_allocator()
00000b08 W std::_Vector_base<private_struct, std::allocator<private_struct> >::_Vector_base()
00000b6e W std::_Vector_base<private_struct, std::allocator<private_struct> >::~_Vector_base()
00000b1c W std::vector<private_struct, std::allocator<private_struct> >::vector()
00000bb2 W std::vector<private_struct, std::allocator<private_struct> >::~vector()

NB: With optimisations on you'll need to make sure the vector is actually used so the compiler doesn't optimise the unused symbols out.

I believe my colleague has managed to construct an ad-hoc solution involving version files and modifying the STL headers (!) that appears to work, but I would like to ask:

Is there a clean way to strip all unnecessary symbols (i.e. ones that are not part of the exposed library functionality) from a Linux shared library? I've tried quite a lot of options to both g++ and ld with little success, so I'd prefer answers that are known to work rather than believed to.

In particular:

  • Symbols from (closed-source) static libraries are not exported.
  • Symbols from the standard library are not exported.
  • Non-public symbols from the object files are not exported.

Our exported interface is C.

I'm aware of the other similar questions on SO, but have had little success with the answers.

Adam Bowen
  • On static linking of system libraries: It's illegal for you to do it. [(e)](http://www.eglibc.org/)[GLIBC](http://www.gnu.org/software/libc/) is licensed under the [LGPL](http://opensource.org/licenses/LGPL-3.0), and that license applies to all code using it unless it is linked dynamically, so by linking statically you make your code covered by the LGPL and are required to provide sources (to anybody you gave a binary to who asks for them). This does not apply to libgcc and libstdc++, whose licenses specifically don't apply to code that uses only the public API, no matter how it is linked. – Jan Hudec Jan 03 '12 at 06:44
  • I'm aware of this, and wasn't referring to symbols from glibc. All the symbols above are generated by template instantiation from the C++ standard library and are, by necessity, generated in my object files (since the template instantiations can't be in the library!). – Adam Bowen Jan 03 '12 at 08:45

6 Answers

9

So the solution we have for now is as follows:

test.cpp

#include <cmath>
#include <vector>
#include <typeinfo>

struct private_struct
{
    float f;
};

float private_function(float f)
{
    return std::abs(f);
}

void other_private_function()
{
    std::vector<private_struct> f(1);
}

extern "C" void __attribute__ ((visibility ("default"))) public_function2()
{
    other_private_function();
}

extern "C" float __attribute__ ((visibility ("default"))) public_function1(float f)
{
    return private_function(f);
}

exports.version

LIBTEST 
{
global:
    public*;
local:
    *;
};

compiled with

g++ -shared test.cpp -o libtest.so -fvisibility=hidden -fvisibility-inlines-hidden -s -Wl,--version-script=exports.version

gives

00000000 A LIBTEST
         w _Jv_RegisterClasses
         U _Unwind_Resume
         U std::__throw_bad_alloc()
         U operator delete(void*)
         U operator new(unsigned int)
         w __cxa_finalize
         w __gmon_start__
         U __gxx_personality_v0
000005db T public_function1
00000676 T public_function2

Which is fairly close to what we're looking for. There are a few gotchas though:

  • We have to ensure we don't use the "exported" prefix (in this simple example "public", but obviously something more useful in our case) in the internal code.
  • Many symbol names still end up in the string table, which appears to be down to RTTI. Compiling with -fno-rtti makes them go away in my simple tests, but it is a rather nuclear solution.

I'm happy to accept any better solutions anyone comes up with!
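As a complement to nm, here is a minimal sketch of checking the result from the consumer's side, assuming the library was built as above. The helper name is_exported is purely illustrative; it only uses dlopen, dlsym and dlclose from <dlfcn.h>, and older glibc versions need -ldl when linking it.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: returns true if `symbol` can be resolved at run time
// through a handle to `library`. Note that dlsym on a handle also searches
// the library's own dependencies, so a hit is not proof that the library
// itself defines the symbol.
bool is_exported(const char* library, const char* symbol)
{
    void* handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }
    bool found = dlsym(handle, symbol) != 0;
    dlclose(handle);
    return found;
}
```

With the version script in place, is_exported("./libtest.so", "public_function1") should be true, while is_exported("./libtest.so", "_Z16private_functionf") (the mangled name of private_function(float)) should be false.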

Adam Bowen
  • I'm accepting our final solution, as it best fits our needs, but for the benefit of anyone else in the same situation I would like to add that the other answers are all perfectly viable solutions if your situation differs from ours a little! – Adam Bowen Feb 17 '10 at 19:07
  • Does this method also hide the symbols from the other static libraries? I'm running into the exact same problem with lots of cruft from various external dependencies being exported. – Soo Wei Tan Jun 07 '10 at 23:49
  • It appears to, yes. The only symbols were our exports and some symbols we were linking to in the C/C++ libraries. – Adam Bowen Jun 08 '10 at 07:56
8

Your use of the default visibility attribute and -fvisibility=hidden should be augmented with -fvisibility-inlines-hidden.

You should also forget about trying to hide stdlib exports; see this GCC bug for why.

Also, if you have all of your public symbols in specific headers, you can wrap them in #pragma GCC visibility push(default) and #pragma GCC visibility pop instead of using attributes. If you are creating a cross-platform library, though, take a look at Controlling Exported Symbols of Shared Libraries for a technique to unify your Windows DLL and Linux DSO export strategy.
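A rough sketch of both ideas in one file, assuming GCC on Linux and a Windows compiler that understands __declspec. The names MYLIB_API, MYLIB_BUILDING, public_abs and public_twice are made up for illustration and are not taken from the linked article.

```cpp
#include <cmath>

// Normally passed as -DMYLIB_BUILDING when compiling the library itself.
#define MYLIB_BUILDING

// One export macro for both platforms (hypothetical names).
#if defined(_WIN32)
#  ifdef MYLIB_BUILDING
#    define MYLIB_API __declspec(dllexport)
#  else
#    define MYLIB_API __declspec(dllimport)
#  endif
#elif defined(__GNUC__) && __GNUC__ >= 4
#  define MYLIB_API __attribute__((visibility("default")))
#else
#  define MYLIB_API
#endif

// Variant 1: mark each public declaration with the macro.
extern "C" MYLIB_API float public_abs(float f);

// Variant 2 (GCC only): wrap a block of public declarations in the pragma.
#if defined(__GNUC__) && !defined(_WIN32)
#pragma GCC visibility push(default)
#endif
extern "C" float public_twice(float f);
#if defined(__GNUC__) && !defined(_WIN32)
#pragma GCC visibility pop
#endif

// Hidden only when compiled with -fvisibility=hidden.
float private_function(float f)
{
    return std::abs(f);
}

// Definitions pick up the visibility of the declarations above.
extern "C" float public_abs(float f)
{
    return private_function(f);
}

extern "C" float public_twice(float f)
{
    return 2.0f * f;
}
```

Built with -fvisibility=hidden -fvisibility-inlines-hidden, only the two extern "C" functions should keep default visibility.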

joshperry
  • Thanks for taking the time to answer, the links made for interesting reading. 1. We expose a pure C interface (for compatibility mostly), why should we expose the details of our implementation? Just because we *could* share RTTI, etc. across library boundaries doesn't mean we *will*. 2. In my simple examples -fvisibility-inlines-hidden made no difference, I don't believe it will affect our interface at all (or resolve our issues) but it may be useful in the future. 3. Unfortunately, the referenced article doesn't appear to offer solutions to any of our problems (beyond what we have). – Adam Bowen Jan 18 '10 at 21:39
  • I see what you mean by private_struct being in the vector instantiation that gets exported. What change is your coworker making to the headers that make these go away? – joshperry Jan 18 '10 at 23:03
7

Just to note that Ulrich Drepper wrote an essay regarding (all?) aspects of writing shared libraries for Linux/Unix, which covers control of exported symbols amongst many other topics.

This was very handy in making it clear how to export only functions on a whitelist from a shared lib.

grrussel
6

If you wrap up your private part in an anonymous namespace then neither std::abs nor private_function can be seen in the symbol table:

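namespace {
// Note: including <cmath> inside the namespace is what gives std::abs
// internal linkage here. The C++ standard only allows standard headers to be
// included outside any declaration, so this relies on compiler behaviour.
#include <cmath>
    float private_function(float f)
    {
        return std::abs(f);
    }
}

extern "C" float public_function(float f)
{
    return private_function(f);
}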

compiling (g++ 4.3.3):

g++ -shared -o libtest.so test.cpp -s

inspecting:

# nm -DC libtest.so
         w _Jv_RegisterClasses
0000200c A __bss_start
         w __cxa_finalize
         w __gmon_start__
0000200c A _edata
00002014 A _end
000004a8 T _fini
000002f4 T _init
00000445 T public_function
catwalk
  • Whilst it works for a simple example and is a good way of hiding local constructs, private namespaces won't scale to the full project. – Adam Bowen Jan 19 '10 at 12:14
3

In general, across multiple Linux and Unix systems, the answer is that there is no answer at link time; it's fairly fundamental to how ld.so works.

This leads to some rather labor-intensive alternatives. For example, we rename STL to live in _STL instead of std to avoid conflicts over STL, and we use namespaces high, low, and in-between to keep our symbols away from possible conflicts with other people's symbols.

Here's a solution you won't love:

  1. Create a small .so with only your exposed API in it.
  2. Have it open the real implementation with dlopen, and link with dlsym.

So long as you don't use RTLD_GLOBAL, you now have complete insulation, if not particular secrecy. -Bsymbolic might also be desirable.
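The two steps might look roughly like this for a single entry point. This is only a sketch: the file name libtest_impl.so, the symbol impl_public_function and the helper load_implementation are assumptions made for illustration, and returning 0.0f on failure is a placeholder for real error handling. Older glibc versions need -ldl when linking.

```cpp
#include <dlfcn.h>
#include <cstdio>

namespace {

typedef float (*public_function_t)(float);

// Opens the implementation library privately (RTLD_LOCAL, not RTLD_GLOBAL)
// and looks up one entry point. The handle is never closed on success,
// because the implementation has to stay loaded while it is in use.
public_function_t load_implementation(const char* library, const char* symbol)
{
    void* handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "cannot load %s: %s\n", library, dlerror());
        return 0;
    }
    return reinterpret_cast<public_function_t>(dlsym(handle, symbol));
}

} // namespace

// The only symbol the small wrapper library needs to export.
extern "C" __attribute__((visibility("default")))
float public_function(float f)
{
    // Resolved once, on first call.
    static public_function_t impl =
        load_implementation("libtest_impl.so", "impl_public_function");
    return impl ? impl(f) : 0.0f;  // placeholder fallback
}
```

The implementation library still exports impl_public_function, so this gives insulation from symbol clashes rather than secrecy, as noted above.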

bmargulies
0

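Actually, an ELF file has two symbol tables, "symtab" and "dynsym". The first is only needed for linking and debugging and can be stripped; the second is what the dynamic loader uses and what nm -D shows. See: Hiding symbol names in library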

vtomazzi
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. Please add some explanation and quotes. – Anna Jan 19 '20 at 23:08