
I'm trying to call some C++ functions through a function pointer table which is exported as a C symbol from a shared object. The code actually works, but Clang's undefined behavior sanitizer (UBSan) reports the call as illegal:

==11410==WARNING: Trying to symbolize code, but external symbolizer is not initialized!
path/to/HelloWorld.cpp:25:13: runtime error: call to function (unknown) through pointer to incorrect function type 'foo::CBar &(*)()'
(./libFoo.so+0x20af0): note: (unknown) defined here

According to Clang's undefined behavior sanitizer, it is legal to indirectly call a function that returns a reference to a C++ standard class object through a function pointer, but illegal when the function returns a reference to a user-defined class. Could somebody please tell me what's wrong with it?

I've been building the project on Ubuntu 14.04 with Clang-llvm 3.4-1ubuntu3 and CMake 2.8.12.2. To reproduce the behavior, place the following 5 files in the same directory and invoke build.sh. It will generate a makefile, build the project, and run the executable.

Foo.h

#ifndef FOO_H
#define FOO_H

#include <string>

//
#define EXPORT __attribute__ ((visibility ("default")))

namespace foo {
    class CBar
    {
        // empty
    };

    class CFoo
    {
    public:
        static CBar& GetUdClass();
        static std::string& GetStdString();
    };

    // function pointer table.
    typedef struct
    {
        CBar& (*GetUdClass)();
        std::string& (*GetStdString)();
    } fptr_t;

    //! function pointer table which is exported.
    extern "C" EXPORT const fptr_t FptrInFoo;
}

#endif

Foo.cpp

#include "Foo.h"
#include <iostream>

using namespace std;

namespace foo
{
    // returns a reference to a newly allocated user-defined class object.
    CBar& CFoo::GetUdClass()
    {
        cout << "CFoo::GetUdClass" << endl;
        return *(new CBar);
    }

    // returns a reference to a newly allocated C++ standard class object.
    std::string& CFoo::GetStdString()
    {
        cout << "CFoo::GetStdString" << endl;
        return *(new string("Hello"));
    }

    // function pointer table which is to be dynamically loaded.
    const fptr_t FptrInFoo = {
        CFoo::GetUdClass,
        CFoo::GetStdString,
    };
}

HelloWorld.cpp

#include <iostream>
#include <string>
#include <dirent.h>
#include <dlfcn.h>
#include "Foo.h"

using namespace std;
using namespace foo;

int main()
{
    // Retrieve a shared object.
    const string LibName("./libFoo.so");
    void *pLibHandle = dlopen(LibName.c_str(), RTLD_LAZY);
    if (pLibHandle != 0) {
        cout << endl;
        cout << "Info: " << LibName << " found at " << pLibHandle << endl;
        // Try to bind a function pointer table:
        const string SymName("FptrInFoo");
        const fptr_t *DynLoadedFptr = static_cast<const fptr_t *>(dlsym(pLibHandle, SymName.c_str()));
        if (DynLoadedFptr != 0) {
            cout << "Info: " << SymName << " found at " << DynLoadedFptr << endl;
            cout << endl;
            // Do something with the functions in the function pointer table.
            DynLoadedFptr->GetUdClass();    // Q1. Why does Clang UBSan find this illegal?
            DynLoadedFptr->GetStdString();  // Q2. And why is this legal?
        } else {
            cout << "Warning: Not found symbol" << endl;
            cout << dlerror() << endl;
        }
    } else {
        cout << "Warning: Not found library" << endl;
        cout << dlerror() << endl;
    }
    cout << endl;
    return 0;
}

CMakeLists.txt

project (test)

if(COMMAND cmake_policy)
      cmake_policy(SET CMP0003 NEW)
endif(COMMAND cmake_policy)

set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,-rpath,$ORIGIN")

add_library(Foo SHARED Foo.cpp)

add_executable(HelloWorld HelloWorld.cpp)
target_link_libraries (HelloWorld dl)

build.sh

#!/bin/bash

# 1. create a build directory.
if [ -d _build ]; then
    rm -rf _build
fi
mkdir _build
cd _build

# 2. generate a makefile.
CC=clang CXX=clang++ CXXFLAGS="-fvisibility=hidden -fsanitize=undefined -O0 -g3" cmake ..

# 3. build.
make

# 4. and run the executable.
./HelloWorld

I've been looking for a clue and found that the issue is caught by the sanitizer's "function" check (-fsanitize=function), but that check is sparsely documented. I'd appreciate a reasonable explanation for this runtime error message, which looks like it came from another planet. Thanks.

What was Clang reporting as "(unknown)" in the output?

Below is the output from addr2line, run to check what "(unknown)" was for the sanitizer:

$ addr2line -Cfe _build/libFoo.so 0x20af0
foo::CFoo::GetUdClass()
path/to/Foo.cpp:12

Hmm, it really looks like the function I was expecting to call. Can you guess why it looked different to Clang?

  • `FptrInFoo` is not a function pointer in namespace `foo`, it is a global! The simple reason is that it is declared as `extern "C"`. Try to declare one in a different namespace and you will see. What I'm wondering now is whether the definition will create an object inside the namespace (and with static linkage, because it's a constant) or if it defines the external global. BTW: Why are you using "typedef struct ...", you can't use that code in C anyway. – Ulrich Eckhardt Jan 18 '15 at 16:13

1 Answer


CBar's typeinfo needs to have default visibility for the function's type to be considered the same by Clang on Linux across the executable and the dynamic library; change Foo.h to:

  class EXPORT CBar
  {
      ...
  };
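
A single-file sketch of that change, assuming GCC or Clang on Linux. `GetBar` and `table` are illustrative names that stand in for `CFoo::GetUdClass` and `FptrInFoo`; in the real project the class definition lives in Foo.h and is compiled into both the library and the executable.

```cpp
// Same macro as in Foo.h: forces default visibility even when the
// translation unit is built with -fvisibility=hidden.
#define EXPORT __attribute__ ((visibility ("default")))

namespace foo {
    // Putting EXPORT on the class applies default visibility to the
    // class's typeinfo as well, which is what the answer asks for.
    class EXPORT CBar
    {
        // empty
    };

    // Hypothetical stand-in for CFoo::GetUdClass (illustration only).
    inline CBar& GetBar()
    {
        static CBar instance;
        return instance;
    }

    // Reduced version of the function pointer table from Foo.h.
    typedef struct
    {
        CBar& (*GetUdClass)();
    } fptr_t;

    // Hypothetical table; the original exports FptrInFoo from libFoo.so.
    const fptr_t table = { GetBar };
}
```

Calling `foo::table.GetUdClass()` returns the same object as calling `foo::GetBar()` directly. The sketch shows only where the attribute goes; it does not reproduce the cross-library situation that triggers the sanitizer report.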
  • Nasty. I wonder if the compiler could be improved to provide a better diagnostic. Good catch Stephan! – Ulrich Eckhardt Jan 19 '15 at 23:04
  • Since I lost a couple of hours on this one, I may add that one should be careful to use `__attribute__ ((visibility ("default")))` both in the library and in the executable calling it. – Arnaud Jan 17 '17 at 22:36
  • This does not seem to suffice if `CBar` is not polymorphic. I have the problem on `void(SomeEnum, const SomeStruct&, const SomeClass&)`. None of these types are polymorphic, so I don't see where the RTTI for any of these types would be emitted by the compiler. It seems to me that this runs afoul of duplicated type_infos the same way inline exception classes cause headaches: https://marcmutz.wordpress.com/2010/08/04/fun-with-exceptions/ – Marc Mutz - mmutz Feb 21 '17 at 21:41
  • The RTTI should be emitted wherever needed, weakly if necessary. And trying with a recent Clang on Linux, that seems to work fine for me: – Stephan Bergmann Feb 26 '17 at 12:26