
Suppose I generate a C program at run time:

char *source = "int add_x_y(int x, int y){ return x + y; }";
int source_size = 42;

I want the following function:

void* compile(char* source, int source_size);

Such that:

int (*f)(int,int) = compile(source, source_size);
printf("%d\n",f(2,3));

Outputs:

5

And compile must not depend on external tools (such as compilers), because I'd like to use it under Emscripten (which converts a C program to a .js file).

Is that possible?

MaiaVictor
  • When you say it can't depend on external tools, are you including GCC (or your compiler of choice)? – Holly Oct 15 '14 at 21:35
  • Google keywords: "runtime compilation", "just in time compilation", "runtime expression evaluation" – salezica Oct 15 '14 at 21:37
  • Yes, I'd like a lightweight, simple compiler that does not depend on GCC, since that isn't available inside a browser. But if you have an answer that absolutely requires `GCC`, please post it, as that is better than nothing (I can adapt my needs to compile on the server, for example). – MaiaVictor Oct 15 '14 at 21:37
  • What you are describing is the combined behavior of a compiler and linker. Although in principle you could implement such a thing yourself, it is hardly practical. – John Bollinger Oct 15 '14 at 21:38
  • Note that it's an entirely different thing to convert C source to a different high-level language, as I guess emscripten does. – John Bollinger Oct 15 '14 at 21:39
  • Practicality aside, are you quite sure that casting a `void*` to a function pointer is defined behaviour? – EOF Oct 15 '14 at 21:39
  • Of course @EOF asks a rhetorical question. The behavior is not defined. As long as we are ignoring practicality, however, such a `compile()` function could return a function pointer, such as an `int (*)()`. – John Bollinger Oct 15 '14 at 21:43
  • But what if the compiled function doesn't return an int? – MaiaVictor Oct 15 '14 at 21:44
  • C has no syntax for a pointer to a function with unspecified return type. Therefore, if you needed to support functions with different return types then you would need a different compile function for each. Furthermore, you need to be sure to call functions with the correct number of arguments, else undefined behavior results. – John Bollinger Oct 15 '14 at 21:47
  • Well, the *obvious* solution is to make the `compile()` function return a varargs-function that itself returns a `void*`. That should cover all bases, and we've already cast aside practicality... – EOF Oct 15 '14 at 21:48
  • Never mind, I just realized the functions will always have the same return type. Still, I don't understand why it is not practical. – MaiaVictor Oct 15 '14 at 22:00
  • C is not the right language for this. It is difficult (and slow) to parse & compile, compilers for C are complex (and thus big), and the language is not designed for any kind of JIT. Just look at a modern C compiler like gcc, and ask yourself if luaJIT wouldn't be better for your needs. – EOF Oct 15 '14 at 22:04
  • I understand, but **the reason** I need this is for enabling JIT for a language I am developing. I can compile a function in my language to C code at runtime, but I need a way to get it to work. I could compile it to asm if necessary, or to some alternative bytecode, or whatever; I don't know. I just need the ability to run native code. That is the whole point of the question. Pointing to Lua in this case is the same as suggesting V8 to the guy behind LuaJIT! – MaiaVictor Oct 16 '14 at 00:27
  • "I need this is for enabling JIT for a language I am developing" -- So you want to do JIT by generating and compiling C source? A bit of a performance dog, don't you think? Anyway, take a look at http://stackoverflow.com/questions/584714/is-there-an-interpreter-for-c ... but I think you're quite confused; if you need to run native code, you shouldn't be generating C source at runtime. – Jim Balter Oct 16 '14 at 01:19
  • Sure, so, if you do know it, feel encouraged to answer with the correct compile target as well as the way to execute it at runtime! – MaiaVictor Oct 16 '14 at 02:24

2 Answers

1

Someone else can probably fill in some of the specifics better than I can, but if you don't mind calling out to GCC or linking to it, it should be doable. Write the code out to a file, then compile the file into a shared library (.so). From there, it's a simple matter of loading the shared library and getting the address of the desired symbol.

Holly
  • I see, thanks, that solves it to a great extent. But is there any option without GCC and files? – MaiaVictor Oct 15 '14 at 21:44
  • @Viclib: it depends. Can "*any* option" go as far as writing your own compiler? Does the code need to be executable, or can it be JIT byte-code as well? Does it need to accept valid C only? (Because your sample is not.) – Jongware Oct 15 '14 at 21:59
  • Writing my own compiler, no. It can be JIT byte-code. Yes, only valid C. – MaiaVictor Oct 15 '14 at 22:01
  • @Viclib: my last question was because your sample program doesn't have a `main` entry point. You'd compile a function, but that in itself doesn't *do* anything. – Jongware Oct 15 '14 at 23:08
0

The answer is specific to the operating system and processor. I assume you are on Linux on x86-64 (64-bit x86) or ia32 (32-bit x86).

You could use tinycc (a compiler that compiles C code quickly, but to slow, unoptimized machine code), which provides a library, libtcc, containing a tcc_compile_string function.

You could use a JIT-compiling library such as libjit, GNU lightning, asmjit, or LLVM (and GCC 5 will have JIT abilities through libgccjit).

Or you could simply write your string to some temporary C file /tmp/genfoo.c (if that file sits in a tmpfs filesystem, no real disk IO is involved, so it is fast) and then fork a real command:

gcc -Wall -fPIC -shared -O /tmp/genfoo.c -o /tmp/genfoo.so

then dlopen(3) the produced /tmp/genfoo.so shared object (and dlsym to get a function pointer from its name).

If you want the generated code to perform well, you need a real optimizing compiler like GCC or Clang/LLVM; the overhead of writing a temporary source file (and parsing it in the compiler) is negligible, because most of the work happens inside the compiler's optimization passes. Generating C code is practical, especially when you want the generated code to be optimized by some C compiler.

Notice that all these techniques probably won't work inside Emscripten, simply because you probably cannot cast a data pointer to a function pointer there (ISO C99 does not define that conversion, although POSIX requires it to work for dlsym results; all the approaches I mention above need it, and you are doing such a cast in your question)! If you need to generate code inside a browser, you probably need to generate some JavaScript or a subset of it (e.g. asm.js). See "calling Javascript from C/C++ in Emscripten".

If you are developing a language to be run inside the browser, make that language generate some JavaScript (e.g. asm.js).

See also NaCl (Native Client, in Google Chrome).

Basile Starynkevitch