8

Hello people, I hope you can help me out with this problem:

I am currently implementing an interpreter for a scripting language. The language needs a native call interface to C functions, like Java has JNI. My problem is that I want to call the original C functions without writing a wrapper function that converts the call stack of my scripting language into the C call stack. This means that I need a way to generate argument lists of C functions at runtime. Example:

void a(int a, int b) {
    printf("function a called %d", a + b);
}

void b(double a, int b, double c) {
    printf("function b called %f", a * b + c);
}

interpreter.registerNativeFunction("a", a);
interpreter.registerNativeFunction("b", b);

The interpreter should be able to call the functions knowing only the function prototypes declared in my scripting language: native void a(int a, int b); and native void b(double a, int b, double c);

Is there any way to generate a C function call stack in C++, or do I have to use assembler for this task? Assembler is a problem, because the interpreter should run on almost any platform.

Edit: The solution is to use libffi, a library that handles call stack creation for many different platforms and operating systems. libffi is also used by some prominent language implementations, such as CPython and OpenJDK.

Edit: @MatsPetersson Somewhere in my code I have a method like:

void CInterpreter::CallNativeFunction(string name, vector<IValue> arguments, IReturnReference ret) {
    // Call the correct native C function here.
    // this->nativeFunctions is a map which contains the function pointers.
}

Edit: Thanks for all your help! I will stay with libffi, and test it on all required platforms.

ruabmbua
  • Have a look at libffi perhaps? – Kerrek SB Oct 26 '14 at 16:39
  • Why have you tagged this C++? – Ed Heal Oct 26 '14 at 16:39
  • yes, but it generally requires you to rely on either compiler extensions or implementation defined behavior. I have a solution for this on my other laptop that I'll post when I get done at the pub. – Captain Obvlious Oct 26 '14 at 16:39
  • @EdHeal I tagged it C++ because I am using C++11 in my project. My scripting language can only call C functions, because the C function call stack is more stable than the C++ one. – ruabmbua Oct 26 '14 at 16:42
  • @CaptainObvlious I do not know of any compiler-specific option that will let me do that. But compiler extensions are not helpful, because I have to use two different compilers: GCC and Clang. – ruabmbua Oct 26 '14 at 16:44
  • So a C++ call stack is unstable?! – Ed Heal Oct 26 '14 at 16:45
  • @KerrekSB libffi actually sounds really good. If I do not find any standard C++ solution, I will take it. – ruabmbua Oct 26 '14 at 16:46
  • Clang and g++ should be call-level compatible, and nearly all extensions are compatible between these compilers. I use both quite a bit both at home and at work. – Mats Petersson Oct 26 '14 at 16:49
  • @EdHeal I think so, but it does not matter. I do not have to mess around with C++ functions, because I do not need any C++ features for my native bindings. C functions are enough. – ruabmbua Oct 26 '14 at 16:51
  • Interesting that you have evidence of unstable call stacks in C++ – Ed Heal Oct 26 '14 at 17:00
  • @EdHeal Then please tell me better. Honestly I do not know it. I just thought it is like that, but like I said it does not matter for me. Why does every language include a C native calling interface, but no C++ one? – ruabmbua Oct 26 '14 at 17:04
  • Have you got an example of how you plan on using this? How does your native language transfer parameters to the "native" interface? (So show the flow of code and data in a call to a native function - you don't need to produce all the code, just explain how it's intended to work). – Mats Petersson Oct 26 '14 at 17:06
  • Also, are you planning to allow the user-code to arbitrarily add functions at a later stage (e.g. using shared libraries), or can the native functions be compiled in when you build the interpreter? – Mats Petersson Oct 26 '14 at 17:07
  • @MatsPetersson My scripting language is not a general purpose scripting language. It is a special purpose language for defining non linear and linear game stories, and events. The interpreter is linked into the game, and the game creates a new interpreter. Then it defines some functions and registers them, to be able to talk with the script. – ruabmbua Oct 26 '14 at 17:14
  • @MatsPetersson Somewhere in my code I have a method like: void CInterpreter::CallNativeFunction(string name, vector<IValue> arguments, IReturnReference ret) { // Call here correct native C function. // this.nativeFunctions is a map which contains the function pointers. } – ruabmbua Oct 26 '14 at 17:18
  • In that case, I'd simply make them a constant prototype (e.g. an array of `lValue`), and just use a map between name and function. – Mats Petersson Oct 26 '14 at 17:22
  • I was able to shell into my laptop and after looking at my old solutions and your requirements I agree with the others and libffi is likely to be your best bet. – Captain Obvlious Oct 26 '14 at 17:24
  • @MatsPetersson I already did it this way, but a colleague of mine said it would be awesome the other way. Challenge accepted ^^. – ruabmbua Oct 26 '14 at 17:25
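
The uniform-prototype approach suggested in the comments above might look roughly like the sketch below. The names Value, NativeFn, Interpreter and native_add are hypothetical and not taken from the asker's code. The assumption is that every native function takes the same argument vector and unpacks it itself.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the interpreter's value type (IValue in the question).
struct Value {
    double number;
};

// Every native function shares this one prototype.
using NativeFn = Value (*)(const std::vector<Value>& args);

class Interpreter {
public:
    void registerNativeFunction(const std::string& name, NativeFn fn) {
        nativeFunctions[name] = fn;
    }

    // Looks the function up by name and forwards the argument vector unchanged.
    Value callNativeFunction(const std::string& name, const std::vector<Value>& args) const {
        auto it = nativeFunctions.find(name);
        if (it == nativeFunctions.end())
            throw std::runtime_error("unknown native function: " + name);
        return it->second(args);
    }

private:
    std::map<std::string, NativeFn> nativeFunctions;
};

// Example native function: it checks and unpacks its own arguments.
Value native_add(const std::vector<Value>& args) {
    if (args.size() != 2)
        throw std::invalid_argument("native_add expects 2 arguments");
    return Value{args[0].number + args[1].number};
}
```

The price of this design is that each native function has to be written against the interpreter's value type, which is exactly the wrapper code the question wants to avoid.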

4 Answers

11

Yes we can. No FFI library needed, no restriction to C calls, only pure C++11.

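#include <functional>   // std::reference_wrapper, std::ref, std::cref
#include <iostream>
#include <list>
#include <string>
#include <type_traits>  // std::remove_reference
#include <boost/any.hpp>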

template <typename T>
auto fetch_back(T& t) -> typename std::remove_reference<decltype(t.back())>::type
{
    typename std::remove_reference<decltype(t.back())>::type ret = t.back();
    t.pop_back();
    return ret;
}

template <typename X>
struct any_ref_cast
{
    X do_cast(boost::any y)
    {
        return boost::any_cast<X>(y);
    }
};

template <typename X>
struct any_ref_cast<X&>
{
    X& do_cast(boost::any y)
    {
        std::reference_wrapper<X> ref = boost::any_cast<std::reference_wrapper<X>>(y);
        return ref.get();
    }
};

template <typename X>
struct any_ref_cast<const X&>
{
    const X& do_cast(boost::any y)
    {
        std::reference_wrapper<const X> ref = boost::any_cast<std::reference_wrapper<const X>>(y);
        return ref.get();
    }
};

template <typename Ret, typename...Arg>
Ret call (Ret (*func)(Arg...), std::list<boost::any> args)
{
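    if (sizeof...(Arg) != args.size())
        throw "Argument number mismatch!";

    // Caution: the order in which function arguments are evaluated is
    // unspecified, so which list element each fetch_back call removes
    // depends on the compiler.
    return func(any_ref_cast<Arg>().do_cast(fetch_back(args))...);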
}

int foo(int x, double y, const std::string& z, std::string& w)
{
    std::cout << "foo called : " << x << " " << y << " " << z << " " << w << std::endl;
    return 42;
}

Test drive:

int main ()
{
    std::list<boost::any> args;
    args.push_back(1);
    args.push_back(4.56);
    const std::string yyy("abc");
    std::string zzz("123");
    args.push_back(std::cref(yyy));
    args.push_back(std::ref(zzz));
    call(foo, args);
}

Exercise for the reader: implement registerNativeFunction in three easy steps.

  1. Create an abstract base class with a pure virtual call method that accepts a list of boost::any; call it AbstractFunction.
  2. Create a variadic class template that inherits from AbstractFunction and adds a pointer to a concrete-type function (or a std::function). Implement call in terms of that function.
  3. Create a map<string, AbstractFunction*> (actually, use smart pointers).
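
One possible way to work through the three steps is sketched below. It is not the answerer's code: it assumes C++17 and swaps in std::any for boost::any so that it needs no external library, and it indexes into a vector with std::index_sequence instead of using fetch_back, so the result does not depend on argument evaluation order. The names ConcreteFunction, Registry, add and mul_add are hypothetical, and only by-value parameters are handled.

```cpp
#include <any>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Step 1: abstract base class with a pure virtual call method.
struct AbstractFunction {
    virtual ~AbstractFunction() = default;
    virtual std::any call(const std::vector<std::any>& args) const = 0;
};

// Step 2: variadic class template that stores a concrete function pointer.
template <typename Ret, typename... Arg>
class ConcreteFunction : public AbstractFunction {
public:
    explicit ConcreteFunction(Ret (*f)(Arg...)) : func(f) {}

    std::any call(const std::vector<std::any>& args) const override {
        if (args.size() != sizeof...(Arg))
            throw std::invalid_argument("argument number mismatch");
        return invoke(args, std::index_sequence_for<Arg...>{});
    }

private:
    // Indexing by position keeps each argument matched to its parameter,
    // whatever order the compiler evaluates the casts in.
    template <std::size_t... I>
    std::any invoke(const std::vector<std::any>& args, std::index_sequence<I...>) const {
        if constexpr (std::is_void_v<Ret>) {
            func(std::any_cast<Arg>(args[I])...);
            return std::any{};
        } else {
            return std::any(func(std::any_cast<Arg>(args[I])...));
        }
    }

    Ret (*func)(Arg...);
};

// Step 3: a map from names to owned AbstractFunction objects.
class Registry {
public:
    template <typename Ret, typename... Arg>
    void registerNativeFunction(const std::string& name, Ret (*f)(Arg...)) {
        functions[name] = std::make_unique<ConcreteFunction<Ret, Arg...>>(f);
    }

    std::any call(const std::string& name, const std::vector<std::any>& args) const {
        return functions.at(name)->call(args);  // at() throws std::out_of_range for unknown names
    }

private:
    std::map<std::string, std::unique_ptr<AbstractFunction>> functions;
};

// Sample native functions, similar in shape to the ones in the question.
int add(int a, int b) { return a + b; }
double mul_add(double a, int b, double c) { return a * b + c; }
```

A call then looks like registry.call("add", {std::any(2), std::any(3)}), and the result is read back with std::any_cast<int>. A wrongly typed argument raises std::bad_any_cast instead of corrupting the call.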

Drawback: this method cannot call variadic C-style functions (e.g. printf and friends) at all. There is also no support for implicit argument conversions: if you pass an int to a function that requires a double, it will throw an exception (which is slightly better than the core dump you can get with a dynamic solution). It is possible to partially solve this for a finite, fixed set of conversions by specializing any_ref_cast.
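
The partial fix mentioned in the last sentence might be sketched as follows. This is an illustration only: it assumes C++17 and uses std::any in place of boost::any, and the choice of accepted source types (int and float for a double parameter) is arbitrary.

```cpp
#include <any>

// Primary template: exact type match only, as in the answer's any_ref_cast.
template <typename X>
struct any_ref_cast {
    X do_cast(const std::any& y) const { return std::any_cast<X>(y); }
};

// Specialization for double parameters: additionally accept a stored int or float.
template <>
struct any_ref_cast<double> {
    double do_cast(const std::any& y) const {
        // The pointer form of any_cast returns nullptr on a type mismatch
        // instead of throwing, which makes it usable as a type test.
        if (const int* i = std::any_cast<int>(&y)) return *i;
        if (const float* f = std::any_cast<float>(&y)) return *f;
        return std::any_cast<double>(y);  // throws std::bad_any_cast for any other type
    }
};
```

Each additional conversion needs its own branch or specialization, which is why this only scales to a small, fixed set of conversions.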

n. m. could be an AI
  • Thanks for the help. You gave me a really good clue about what I am now trying to do. I am not so good with the new C++11 variadic templates yet, but I think I understand them now (I came back to C++ a few weeks ago). – ruabmbua Oct 26 '14 at 18:19
  • Do you know, how much overhead this uses? B.t.w. it works already. I only removed the unnecessary second mention of typename std::remove_reference::type and replaced it with auto. – ruabmbua Oct 26 '14 at 18:34
  • Should not have any significant overhead compared to any other method, including hand-written wrappers. – n. m. could be an AI Oct 26 '14 at 18:38
  • I did not know that such a thing as a trailing return type existed in C++. (Okay, I think I read about it once in Bjarne's C++11 book, but I never used it.) – ruabmbua Oct 26 '14 at 18:38
  • It is unlikely this mechanism, which requires compile time binding, is more suitable than the dynamic binding solution offered in the other answer. I thought you stated your function prototypes would be unknown until runtime? – jxh Oct 26 '14 at 18:54
  • @jxh Which other answer offers dynamic binding? I only can find one other answer and it doesn't seem to offer any solution at all. If it does, and I fail to see it, kindly point me to a [demo](http://coliru.stacked-crooked.com/a/48951a914ee48495). – n. m. could be an AI Oct 26 '14 at 19:02
  • @jxh If you mean the comment to use `libffi`, then yes it is a solution and yes it does offer dynamic bindings, but for the price of no type safety (i.e. the user can crash the interpreter or get undefined results by supplying a wrong argument list to a call). – n. m. could be an AI Oct 26 '14 at 19:10
  • I got a problem here. The solution works with all primitive data types as arguments, but fails with std::string? – ruabmbua Oct 26 '14 at 19:10
  • See the [demo](http://coliru.stacked-crooked.com/a/48951a914ee48495). It does use `std::string` successfully. You do need to modify the code to support *reference* arguments (use `std::forward` and `Args&&` where appropriate, I didn't have time to write it all properly; I might be able to correct it later today). – n. m. could be an AI Oct 26 '14 at 19:14
  • I had to reverse the order of the arguments. That was the problem ^^. – ruabmbua Oct 26 '14 at 19:24
  • Hm, const references work with no modification to the code, but non-const references don't, this looks like a boost::any limitation. – n. m. could be an AI Oct 26 '14 at 19:25
  • Or replace fetch_back with fetch_front. – ruabmbua Oct 26 '14 at 19:30
  • I have added reference and const reference support for fun (no "universal" references `&&`, but hey, it's just a PoC). – n. m. could be an AI Oct 26 '14 at 19:47
0

The way to do this is to use pointers to functions:

void (*native)(int a, int b) ;

The problem you will face is that finding the address of the function to store in the pointer is system dependent.

On Windows, you will probably load a DLL, find the address of the function by name within the DLL, and then store that pointer in native to call the function.
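
As a minimal sketch of calling through such a pointer (without the system-dependent DLL lookup), the code below stores a function's address in a table and calls it by name. NativeIntInt, last_sum and call_by_name are hypothetical names. Note that the pointer type fixes the prototype to void(int, int), so this covers only one signature.

```cpp
#include <map>
#include <string>

// Function-pointer type for one fixed prototype: void(int, int).
using NativeIntInt = void (*)(int, int);

static int last_sum = 0;  // records the result so the effect is observable

void a(int x, int y) { last_sum = x + y; }

// Looks the function up by name and calls it through the pointer;
// returns false for unknown names.
bool call_by_name(const std::map<std::string, NativeIntInt>& table,
                  const std::string& name, int x, int y) {
    auto it = table.find(name);
    if (it == table.end()) return false;
    NativeIntInt native = it->second;
    native(x, y);
    return true;
}
```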

user3344003
  • That does not work for my problem. The function pointer defines the prototype of each function which can be called from it, but I need to generate the prototype dynamically. – ruabmbua Oct 26 '14 at 16:53
  • @ruabmbua Then that's not going to be portable. That's gonna be heavily platform-dependent. If you don't want to write all the fiddly stuff yourself, you can use libffi, as mentioned in another comment. – The Paramagnetic Croissant Oct 26 '14 at 16:57
  • This is not what the OP asks. – n. m. could be an AI Oct 26 '14 at 17:11
  • Do you have a finite number of prototypes or infinite? Your question suggests the number is finite and that you could use multiple pointers to functions. If not, I would suggest variable argument lists. – user3344003 Oct 26 '14 at 17:23
  • @user3344003 I have an unlimited number of different prototypes, because I do not know what they look like. The interpreter should be reusable in many projects without changing any of the source code. – ruabmbua Oct 26 '14 at 18:03
0

In pure standard C++ (or C; see n1570 or n3337 or some newer standard specification, a document written in English), the set of functions is fixed -so cannot change-, and given by the union of all your translation units (and by those from the standard C or C++ library). And in pure standard C++ or C, a function pointer is allowed only to point to some pre-existing function (otherwise it is undefined behavior), when you use it for indirect calls. All functions are, in standard C++ (or C), known at "compile-time", and practically declared in some translation unit (and often implemented in another one, or in some external library).

BTW, when coding an interpreter (for some scripting language), you don't need to grow the set of your (C or C++) functions. You just need to have (generic) interpreting functions coded in C or C++ dealing with some representation of the interpreted scripting code (which, from the point of view of your C++ or C program, is some data), perhaps an AST or some bytecode. For example, a Unix shell, or a Lua or Guile interpreter, don't create C or C++ functions. You could embed Lua or Guile in your program.

However, you might be interested in generating or creating new (C or C++) functions at runtime, for example when compiling your scripting code into C (a common practice) or in machine code. This is not possible in pure standard C or C++, but is practically possible on many implementations, with help from the operating system (at least to grow or add code segments, i.e. new machine code, in your virtual address space).

(notice that any mechanism able to create functions at runtime is outside of the C or C++ standard, and would return function pointers to new machine code)

See also this answer (to a very related question, for C; but you could adapt it for C++), detailing how that is practically possible (notably on Linux).

BTW, libffi is by itself not a way of creating new (C, C++, or machine code) functions, but of calling existing functions of an arbitrary signature with arbitrary arguments.

"This means that I need a way to generate argument lists of C functions at runtime."

libffi does exactly that. It knows your ABI (and is partly coded in assembler).

Notice that if your set of functions is fixed (so finite), their signatures are also in a finite set, then you don't really need libffi (because you could special case all your signatures, so your signatures are not arbitrary), even if it could be convenient.
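
Special-casing a finite set of signatures could look like the sketch below, which supports only the two signatures from the question. All names (Arg, Signature, Native, make_native, call_native, last_result) are hypothetical. The function pointer is stored under a generic function-pointer type and cast back to its exact original type before the call, which is a well-defined round trip in C++.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical tagged argument: just enough for the two signatures below.
struct Arg {
    enum Kind { Int, Double } kind;
    int i;
    double d;
};

Arg int_arg(int v) { return Arg{Arg::Int, v, 0.0}; }
Arg double_arg(double v) { return Arg{Arg::Double, 0, v}; }

int as_int(const Arg& arg) {
    if (arg.kind != Arg::Int) throw std::invalid_argument("expected an int argument");
    return arg.i;
}

double as_double(const Arg& arg) {
    if (arg.kind != Arg::Double) throw std::invalid_argument("expected a double argument");
    return arg.d;
}

// One enumerator per supported signature.
enum class Signature { IntInt, DoubleIntDouble };

// A native function stored under a generic pointer type plus its signature tag.
struct Native {
    Signature sig;
    void (*raw)();
};

// Overloads pick the tag from the function's real type, so tag and pointer cannot disagree.
Native make_native(void (*f)(int, int)) {
    return Native{Signature::IntInt, reinterpret_cast<void (*)()>(f)};
}
Native make_native(void (*f)(double, int, double)) {
    return Native{Signature::DoubleIntDouble, reinterpret_cast<void (*)()>(f)};
}

// Casts back to the exact original type before calling.
void call_native(const Native& n, const std::vector<Arg>& args) {
    switch (n.sig) {
    case Signature::IntInt:
        if (args.size() != 2) throw std::invalid_argument("expected 2 arguments");
        reinterpret_cast<void (*)(int, int)>(n.raw)(as_int(args[0]), as_int(args[1]));
        return;
    case Signature::DoubleIntDouble:
        if (args.size() != 3) throw std::invalid_argument("expected 3 arguments");
        reinterpret_cast<void (*)(double, int, double)>(n.raw)(
            as_double(args[0]), as_int(args[1]), as_double(args[2]));
        return;
    }
    throw std::logic_error("unknown signature");
}

// Sample natives shaped like the question's a and b; they record their result.
double last_result = 0.0;
void a(int x, int y) { last_result = x + y; }
void b(double x, int y, double z) { last_result = x * y + z; }
```

Every new signature needs one more enumerator, one more make_native overload and one more case, which is the maintenance cost that libffi removes.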

Once you are adding new functions at runtime of arbitrary signatures, libffi or an equivalent mechanism is absolutely needed (because even the set of called signatures could grow).

Basile Starynkevitch
  • I'm interested in finding which section of the standard says that the set of functions is fixed, could you point me in the direction? – Learath2 Mar 21 '20 at 17:14
  • In C, a function pointer has to point to an existing function... In any program, only a finite set of them exist. `dlsym` is unspecified behavior in standard C, even if POSIX defines it – Basile Starynkevitch Mar 21 '20 at 17:33
  • C99 6.5.2.2p9 Says that a call through an expression that doesn't match would be UB. However, I can't find any justification for "In C, a function pointer has to point to an existing function..." – Learath2 Mar 21 '20 at 22:52
0

Many ways to do this.

  1. Use boost (see first answer)
  2. Use std::bind. Similar to boost, but simpler.
  3. Use C function pointer.

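Example:

#include <stdio.h>

// Caution: calling a function through a pointer of a different function type
// is undefined behavior in standard C and C++; this only works where the
// platform's calling convention happens to tolerate it.
#define DYNAMIC(p,arg,n) {\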
if(0==n) ((void (*)())p)();\
else if(1==n) ((void (*)(int))p)(arg[0]);\
else if(2==n) ((void (*)(int, int))p)(arg[0], arg[1]);\
else if(3==n) ((void (*)(int, int, int))p)(arg[0], arg[1], arg[2]);\
}

void fun0()
{
    printf("no arg \n");
}

void fun1(int a)
{
    printf("arg %x\n", a);
}

void fun2(int a, const char *b)
{
    printf("arg %x %s \n", a,b);
}

void fun3(int a,const char *b, int c)
{
    printf("arg %x %s %x\n", a, b, c);
}

    int a = 0xabcd;
    const char* b = "test dynamic function";
    int c = 0xcdef;

    int d[] = { 1, (int)b, c };  // assumes a pointer fits in an int (32-bit target)
    DYNAMIC(fun0, d, 0);
    DYNAMIC(fun1, d, 1);
    DYNAMIC(fun2, d, 2);
    DYNAMIC(fun3, d, 3);
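  4. Use some asm code, which is more flexible (MSVC inline assembly, 32-bit x86, cdecl only):

#include <stdint.h>

// Pushes n 32-bit arguments right to left, calls p, then restores the stack.
void call(void* p, uint32_t arg[], int n)
{
    for (int i = n-1; i >-1; --i)
    {
        uint32_t u = arg[i];
        __asm push u
    }
    __asm call p
    int n2 = n * 4;
    __asm add esp,n2
}

int a = 0xabcd;
const char* b = "test dynamic function";
int c = 0xcdef;
uint32_t d[] = { 1, (uint32_t)b, (uint32_t)c };  // a pointer must fit in 32 bits
call((void*)fun3, d, 3);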

blackshadow