60

Is it possible to compile a C++ (or similar) program without generating an executable file, writing it and executing it directly from memory instead?

For example, with GCC or Clang, something that has an effect similar to this command line:

c++ hello.cpp -o hello.x && ./hello.x $@ && rm -f hello.x


But without the burden of writing an executable to disk only to immediately load and run it.

(If possible, the procedure should not use disk space, or at least not space in the current directory, which might be read-only.)

Fahim Uz Zaman
  • 444
  • 3
  • 6
alfC
  • 14,261
  • 4
  • 67
  • 118
  • @David Heffernan, alfC never explicitly specified using Linux. He just provided a `gcc` build process as an example of what he wanted to do. – Matt Kline Dec 03 '12 at 19:49
  • 2
    @TJD Using a RAM disk to decrease build times is a fairly common idea. See [this thread](http://stackoverflow.com/questions/3442170/is-it-sensible-to-use-a-ramdisk-on-a-build-server) for example. – Matt Kline Dec 03 '12 at 19:54
  • 2
    @slavik262 The question specifically mentions Linux. Anyway, any answer is going to be heavily OS specific. – David Heffernan Dec 03 '12 at 19:55
  • @slavik262 Doesn't the OS disk cache remove the need for RAM disks? – David Heffernan Dec 03 '12 at 19:57
  • @slavik262, Yes using RAM disk is great because it speeds the access to the various source and intermediate object files. This question is just asking about the burden of writing the final executable, not putting the whole build tree in RAM. – TJD Dec 03 '12 at 19:57
  • Thank you all for the interest. If a compiler supported this, it would be a compiler-dependent question, but not necessarily an OS-dependent question. I'll remove the Linux reference (just an example) if that is confusing. Also, thank you for the RAM disk suggestions. My motivation is not for speeding up compilation (optimization) necessarily, but to be able to compile and run in situations where write-to-disk (globally or locally) is not guaranteed although the sources are available; so I was looking for a compiler/tool-based solution. – alfC Dec 03 '12 at 22:00
  • Also, let me clarify that although closely related, and similar in effect, this question is not about a C++ interpreter. – alfC Dec 03 '12 at 23:53
  • @TJD, no, I don't want this for optimization particularly. – alfC Dec 04 '12 at 00:00
  • @alfC What do you mean with _'Lothar Krause's answer seems to be in the right direction but it doesn't have enough detail'_? You didn't at least put any comment on this answer to tell what you're actually missing. IMHO Lothar Krause's answer clearly explains the solution. How to setup a pipe and using the executed command's results via a file handle is a completely different question. You should try to search for a sample for usage of `pipe()` and `fexecve()` or read the documentation. – πάντα ῥεῖ Dec 17 '12 at 19:35
  • I guess you put in words what I see missing (in part because of ignorance), "How to setup a pipe and using the executed command's results via a file handle"? Also file descriptor is something I didn't know about. In my ignorance it looked that his answer was just a proof of concept not a complete answer, but probably I am missing something. – alfC Dec 18 '12 at 08:32
  • @alfC You wrote that write-to-disk (globally or locally) is not guaranteed, but in your question you state that it was possible to use temporary files. So do we have no guarantees for the temporary files either? – Lothar Dec 20 '12 at 08:00
  • @LotharKrause, I mean that if the compiler needs to write temporary files and finds the location to do it then is fine. But maybe that is confusing. I'll remove it from the question. – alfC Dec 20 '12 at 18:23
  • I don't understand why writing to some fast file system (`tmpfs` or on a fast disk, e.g. SSD) is not enough: most of the time is spent in compiling that C++ generated code, so IO time is negligible. – Basile Starynkevitch Dec 22 '12 at 19:07
  • You should explain why you don't want to go thru files. I don't understand why you want to avoid files. (Performance or time is not relevant; most of the time is spent inside `g++` once you have `-O` or `-O2` ....) – Basile Starynkevitch Dec 22 '12 at 19:36
  • 4
    I know you are probably interested in methods that will work with current tools, but historically the answer is absolutely yes. The method is called "compile and go." It is discussed in older compiler textbooks and has been around since at least the 60's. The idea was to eliminate file system delays, and it worked well. E.g. from mid-80's to mid-90's there were several versions of Turbo Pascal that did this. They were blazingly fast: 10's of thousands of lines per second on the 80486 processors of the day, when file-based compilation schemes were doing thousands or hundreds. – Gene Dec 24 '12 at 01:17
  • Compile and go from the 1960s is quite similar to today's Just In Time compilation – Basile Starynkevitch Dec 24 '12 at 06:42
  • @BasileStarynkevitch, for example suppose I want to compile/execute without relying on having a writable space. Or avoid intermediate files. – alfC Apr 14 '14 at 02:06

7 Answers

52

Possible? Not the way you seem to wish. The task has two parts:

1) How to get the binary into memory

When we specify /dev/stdout as the output file on Linux, we can pipe the result into our program x0, which reads an executable from stdin and executes it:

  gcc -pipe YourFiles1.cpp YourFile2.cpp -o/dev/stdout -Wall | ./x0

In x0 we can just read from stdin until reaching the end of the file:

#include <stdbool.h>  /* true */
#include <stdlib.h>   /* realloc */
#include <unistd.h>   /* read, STDIN_FILENO */

int memexec(void * exe, size_t exe_size, char * const argv[]); /* defined below */

int main(int argc, char ** argv)
{
    size_t ntotal = 0;
    char * buf = NULL;
    while(true)
    {
        /* increasing buffer size dynamically since we do not know how many bytes to read */
        buf = (char*)realloc(buf, ntotal + 4096);
        ssize_t nread = read(STDIN_FILENO, buf + ntotal, 4096);
        if (nread <= 0) break;   /* 0 means end of file, <0 means error */
        ntotal += nread;
    }
    memexec(buf, ntotal, argv);
}

It would also be possible for x0 to execute the compiler directly and read its output. That question has been answered here: Redirecting exec output to a buffer or file

Caveat: I just figured out that for some strange reason this does not work when I use a pipe (|) but does work when I use redirection (x0 < foo).

Note: If you are willing to modify your compiler, or if you do JIT with LLVM, clang or other frameworks, you could directly generate executable code. However, for the rest of this discussion I assume you want to use an existing compiler.

Note: Execution via temporary file

Other programs such as UPX achieve similar behavior by executing a temporary file; this is easier and more portable than the approach outlined below. On systems where /tmp is mapped to a RAM disk (typical for servers, for example), the temporary file will be memory-based anyway.

#define _GNU_SOURCE       /* fexecve */
#include <stddef.h>       /* size_t */
#include <fcntl.h>        /* open, O_RDONLY */
#include <stdio.h>        /* perror */
#include <stdlib.h>       /* mkstemp */
#include <sys/stat.h>     /* chmod, S_IRUSR, S_IXUSR */
#include <unistd.h>       /* write, close, unlink, fexecve */

int memexec(void * exe, size_t exe_size, char * const argv[])
{
    /* random temporary file name in /tmp */
    char name[15] = "/tmp/fooXXXXXX";
    /* creates the temporary file, returns a writeable file descriptor */
    int fd_wr = mkstemp(name);
    /* makes the file executable and read-only */
    chmod(name, S_IRUSR | S_IXUSR);
    /* creates a read-only file descriptor before deleting the file */
    int fd_ro = open(name, O_RDONLY);
    /* removes the file from the file system; the kernel keeps the content until all fds are closed */
    unlink(name);
    /* writes the executable into the file */
    write(fd_wr, exe, exe_size);
    /* fexecve will not work as long as there is an open writeable file descriptor */
    close(fd_wr);
    char *const newenviron[] = { NULL };
    /* replaces the current process image; returns only on error */
    fexecve(fd_ro, argv, newenviron);
    perror("failed");
    return -1;
}

Caveat: Error handling is left out for clarity's sake; includes are kept brief.

Note: By combining main() and memexec() into a single function and using splice(2) to copy directly from stdin to fd_wr, the program could be significantly optimized; see the sketch below.
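
A minimal sketch of that splice(2) idea, assuming stdin is a pipe (as it is when the compiler output is piped into x0); error handling omitted:

#define _GNU_SOURCE
#include <fcntl.h>    /* splice, SPLICE_F_MORE */
#include <unistd.h>   /* STDIN_FILENO */

/* copies everything from stdin into fd_wr without a userspace buffer;
   splice(2) requires one end to be a pipe, which stdin is here */
static void copy_stdin_to_fd(int fd_wr)
{
    while (splice(STDIN_FILENO, NULL, fd_wr, NULL, 1 << 16, SPLICE_F_MORE) > 0)
        ;
}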

2) Execution directly from memory

One does not simply load and execute an ELF binary from memory. Some preparation, mostly related to dynamic linking, has to happen. There is a lot of material explaining the various steps of the ELF linking process, and studying it makes me believe it is theoretically possible. See for example this closely related question on SO; however, there seems to be no working solution.

Update: UserModeExec seems to come very close.

Writing a working implementation would be very time consuming, and surely raise some interesting questions in its own right. I like to believe this is by design: for most applications it is strongly undesirable to (accidentally) execute their input data because it allows code injection.

What happens exactly when an ELF is executed? Normally the kernel receives a file name, then creates a process, loads and maps the different sections of the executable into memory, performs a lot of sanity checks, and marks it as executable before passing control and a file name to the run-time linker ld-linux.so (part of libc). The linker takes care of relocating functions, handling additional libraries, setting up global objects and jumping to the executable's entry point. As I understand it, this heavy lifting is done by dl_main() (implemented in libc/elf/rtld.c).

Even fexecve is implemented using a file in /proc, and it is this need for a file name that would lead us to reimplement parts of this linking process.
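
For illustration, a minimal sketch of the /proc trick that fexecve(3) has traditionally relied on, exec'ing the magic path /proc/self/fd/<fd>; the helper name is made up and error handling is omitted:

#include <stdio.h>    /* snprintf, perror */
#include <unistd.h>   /* execve */

/* executes whatever executable the file descriptor fd refers to */
static void exec_by_fd(int fd, char *const argv[], char *const envp[])
{
    char path[64];
    snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
    execve(path, argv, envp);   /* returns only on error */
    perror("execve");
}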


So it seems possible; you decide whether it is also practical.

alfC
  • 14,261
  • 4
  • 67
  • 118
Lothar
  • 860
  • 6
  • 21
  • ok, so I issued the first command and got the executable output (binary code) on the screen. That looks promising; where do I pipe this to? Do I have to pipe it to a special executable made from your C code? Where are `pipe`, `fork`, `exec` used? How is the piped binary code related to `fooXXXXXX`? – alfC Dec 18 '12 at 08:27
  • You are right, maybe I made somewhat of a jump. The piped binary code will be written using the file descriptor `fd` which will have a random name starting with foo. Hope I can provide an example later. In any case my starting point would be this: http://users.encs.concordia.ca/~mia/tutorials/coen346/IPC_threads/pipe_parent_child.html – Lothar Dec 18 '12 at 10:28
  • Awesome answer. First, I wonder why something like your program is not one of the programs included in the operating system, that is, a program that takes a binary stream and executes it. Second, there seems to be a problem with the `main` part of the code, as `./x0` hangs while reading a stream. I had to change to `if (nread != 0) break;` in order to finish reading the stream. And with this change I get a message `failed: Exec format error`. I am trying implementation 2) so far. – alfC Dec 22 '12 at 03:40
  • @alfC I assume it is loading the file. Unfortunately I am away from my computer. Try using a fixed size read on `char buf[SOME_BIG_NUMBER]` – Lothar Dec 22 '12 at 09:19
  • @alfC: because as cool as this answer is, something like `x0` is the biggest security hole I've ever seen. Especially considering how easy it would be to hack. I don't want that thing on my system. Never. They need to be integrated into one process so `x0` doesn't exist. Otherwise, your security is screwed. – Linuxios Dec 22 '12 at 15:22
  • I don't feel this answer works without files, it just deploys standard tricks about temporary files. I don't see the gain w.r.t. to using files (e.g. in a `tmpfs` system). – Basile Starynkevitch Dec 22 '12 at 20:00
  • @BasileStarynkevitch, yes, it seems that option 1 still relies on creating a temporary file and is equivalent to using the `mktemp` program. Option 2 seems more promising but I cannot make it work (as the `main` part doesn't work for me). – alfC Dec 23 '12 at 22:30
24

Yes, though doing it properly requires designing significant parts of the compiler with this in mind. The LLVM guys have done this, first with a kinda-separate JIT, and later with the MC subproject. I don't think there's a ready-made tool doing it. But in principle, it's just a matter of linking to clang and llvm, passing the source to clang, and passing the IR it creates to MCJIT. Maybe a demo does this (I vaguely recall a basic C interpreter that worked like this, though I think it was based on the legacy JIT).

Edit: Found the demo I recalled. Also, there's cling, which seems to do basically what I described, but better.
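
As an illustration of the second half of that pipeline (handing IR to MCJIT), here is a minimal sketch using the LLVM C API; it JITs a hand-built add function rather than IR produced by clang, the build line is approximate, and the exact headers and link flags depend on your LLVM version:

/* build roughly like: clang jit.c $(llvm-config --cflags --ldflags --libs core mcjit native) -o jit */
#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>
#include <stdio.h>

int main(void)
{
    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    /* build a module containing: int add(int a, int b) { return a + b; } */
    LLVMModuleRef mod = LLVMModuleCreateWithName("inmemory");
    LLVMTypeRef params[] = { LLVMInt32Type(), LLVMInt32Type() };
    LLVMValueRef fn = LLVMAddFunction(mod, "add",
                                      LLVMFunctionType(LLVMInt32Type(), params, 2, 0));
    LLVMBuilderRef b = LLVMCreateBuilder();
    LLVMPositionBuilderAtEnd(b, LLVMAppendBasicBlock(fn, "entry"));
    LLVMBuildRet(b, LLVMBuildAdd(b, LLVMGetParam(fn, 0), LLVMGetParam(fn, 1), "sum"));

    /* hand the module to MCJIT and call the compiled code directly from memory */
    char *error = NULL;
    LLVMExecutionEngineRef ee;
    if (LLVMCreateExecutionEngineForModule(&ee, mod, &error)) {
        fprintf(stderr, "MCJIT: %s\n", error);
        return 1;
    }
    int (*add)(int, int) = (int (*)(int, int))LLVMGetFunctionAddress(ee, "add");
    printf("%d\n", add(2, 40));   /* prints 42 */

    LLVMDisposeBuilder(b);
    LLVMDisposeExecutionEngine(ee);   /* also frees the module it owns */
    return 0;
}

Turning C++ source into IR in-process (linking against clang itself) is the part this sketch leaves out.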

  • Although not a solution and I wasn't looking for an interpreter, this is the kind of answer I was expecting. +1 – alfC Dec 03 '12 at 21:53
  • @alfC - You can do exactly what you want with LLVM JIT. Create an llvm::Module from LLVM IR code (get it by compiling your C/C++ code into LLVM IR via clang linked into your executable). Create an llvm::ExecutionEngine from that Module, and then call yourEngine->FindFunctionNamed(someFunction), and then yourEngine->runFunction(pSomeFunction) via JIT. It can even give you a return value. As long as you don't blow away your ExecutionEngine, the JITed code can be called again & again, faster on subsequent calls because it's JITed on the 1st call. – phonetagger Dec 20 '12 at 16:30
  • as far as I know there is a gcc JIT project and Android NDK-compatible ones. The problem is that it's all implementation- and OS-specific – Swift - Friday Pie Oct 05 '20 at 12:34
22

Linux can create virtual file systems in RAM using tmpfs. For example, I have my tmp directory set up in my file system table like so:

tmpfs       /tmp    tmpfs   nodev,nosuid    0   0

Using this, any files I put in /tmp are stored in my RAM.

Windows doesn't seem to have any "official" way of doing this, but has many third-party options.

Without this "RAM disk" concept, you would likely have to heavily modify a compiler and linker to operate completely in memory.

Matt Kline
  • 10,149
  • 7
  • 50
  • 87
  • Hear hear. I believe this is sometimes known as a "RAM disk". – Ross Rogers Dec 03 '12 at 19:46
  • Actually, the modifications needed on Windows would be fairly small. Just pass `FILE_ATTRIBUTE_TEMPORARY` when creating the output file. This suppresses the flush to disk, keeping the file in cache. The documentation explicitly states that this may avoid a write if the file is deleted soon after, as is intended here. – MSalters Dec 21 '12 at 16:37
9

If you are not specifically tied to C++, you may also consider other JIT based solutions:

  • in Common Lisp SBCL is able to generate machine code on the fly
  • you could use TinyCC and its libtcc.a, which quickly emits poor (i.e. unoptimized) machine code from C code in memory (see the libtcc sketch after this list)
  • consider also any JITing library, e.g. libjit, GNU Lightning, LLVM, GCCJIT, asmjit
  • of course emitting C++ code on some tmpfs and compiling it...
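
A minimal sketch of the libtcc route mentioned above (API as in tcc 0.9.27; link with -ltcc; the tiny program string and the add function are just for illustration):

#include <stdio.h>
#include <libtcc.h>

const char *prog = "int add(int a, int b) { return a + b; }";

int main(void)
{
    TCCState *s = tcc_new();
    tcc_set_output_type(s, TCC_OUTPUT_MEMORY);                 /* compile into memory, not into a file */
    if (tcc_compile_string(s, prog) == -1) return 1;
    if (tcc_relocate(s, TCC_RELOCATE_AUTO) < 0) return 1;      /* link in memory */
    int (*add)(int, int) = (int (*)(int, int))tcc_get_symbol(s, "add");
    if (!add) return 1;
    printf("%d\n", add(2, 40));                                /* prints 42 */
    tcc_delete(s);
    return 0;
}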

But if you want good machine code, you'll need it to be optimized, and that is not fast (so the time to write to a filesystem is negligible).

If you are tied to C++ generated code, you need a good optimizing C++ compiler (e.g. g++ or clang++); they take significant time to compile C++ code to an optimized binary. So you should generate the code into some file foo.cc (perhaps in a RAM file system like some tmpfs, but that gives only a minor gain, since most of the time is spent inside the g++ or clang++ optimization passes, not reading from disk), then compile that foo.cc to foo.so (perhaps using make, or at least forking g++ -Wall -shared -fPIC -O2 foo.cc -o foo.so, perhaps with additional libraries). At last, have your main program dlopen that generated foo.so. FWIW, MELT was doing exactly that, and on a Linux workstation the manydl.c program shows that a process can generate and then dlopen(3) many hundreds of thousands of temporary plugins, each one obtained by generating a temporary C file and compiling it. For C++, read the C++ dlopen mini HOWTO.
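
A rough sketch of that generate/compile/dlopen cycle; the file names, the generated function and the g++ flags are assumptions for illustration (link the host program with -ldl):

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* 1. emit the C++ source, ideally on a tmpfs such as /tmp */
    FILE *src = fopen("/tmp/foo.cc", "w");
    fprintf(src, "extern \"C\" int answer(void) { return 42; }\n");
    fclose(src);

    /* 2. fork the real optimizing compiler to build a plugin */
    if (system("g++ -Wall -shared -fPIC -O2 /tmp/foo.cc -o /tmp/foo.so") != 0)
        return 1;

    /* 3. dlopen the generated plugin and call into it */
    void *h = dlopen("/tmp/foo.so", RTLD_NOW);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    int (*answer)(void) = (int (*)(void))dlsym(h, "answer");
    printf("%d\n", answer());
    dlclose(h);
    return 0;
}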

Alternatively, generate a self-contained source program foobar.cc, compile it to an executable foobarbin, e.g. with g++ -O2 foobar.cc -o foobarbin, and execute that foobarbin binary with execve.

When generating C++ code, you may want to avoid generating tiny C++ source files (e.g. a dozen lines only); if possible, generate C++ files of at least a few hundred lines, unless lots of template expansion happens through extensive use of existing C++ containers, in which case generating a small C++ function combining them makes sense. For instance, try if possible to put several generated C++ functions in the same generated C++ file (but avoid very big generated C++ functions, e.g. 10KLOC in a single function; they take a lot of time to be compiled by GCC). You could also consider, if relevant, having only a single #include in that generated C++ file, and pre-compiling that commonly included header.

Jacques Pitrat's book Artificial Beings, the conscience of a conscious machine (ISBN 9781848211018) explains in detail why generating code at runtime is useful (in symbolic artificial intelligence systems like his CAIA system). The RefPerSys project is trying to follow that idea and generate some C++ code (and hopefully, more and more of it) at runtime. Partial evaluation is a relevant concept.

Your software is likely to spend more CPU time in generating C++ code than GCC in compiling it.

Basile Starynkevitch
  • 223,805
  • 18
  • 296
  • 547
2

The tcc compiler's "-run" option allows exactly this: compile into memory, run from there, and finally discard the compiled code. No filesystem space is needed. "tcc -run" can be used in a shebang line to allow for C scripts; from the tcc man page:

#!/usr/local/bin/tcc -run
#include <stdio.h>

int main()
{
    printf("Hello World\n");
    return 0;
}

Mixed bash/C scripts are possible as well; with "tcc -run" no temporary space is needed:

#!/bin/bash

echo "foo"
sed -n "/^\/\*\*$/,\$p" $0 | tcc -run -

exit
/**
*/
#include <stdio.h>

int main()
{
    printf("bar\n");
    return 0;
}

Execution output:

$ ./shtcc2
foo
bar
$

C scripts with gcc are possible as well, but as others mentioned they need temporary space to store the executable. This script produces the same output as the previous one:

#!/bin/bash

exc=/tmp/`basename $0`
if [ $0 -nt $exc ]; then sed -n "/^\/\*\*$/,\$p" $0 | gcc -x c - -o $exc; fi

echo "foo"
$exc

exit
/**
*/
#include <stdio.h>

int main()
{
    printf("bar\n");
    return 0;
}

C scripts with the suffix ".c" are nice; headtail.c was my first ".c" file that needed to be executable:

$ echo -e "1\n2\n3\n4\n5\n6\n7" | ./headtail.c 
1
2
3
6
7
$

I like C scripts because you have just one file that you can easily move around, and changes in the bash or C part require no further action; they just work on the next execution.

P.S.:
The "tcc -run" C script shown above has a problem: the C script's stdin is not available to the executed C code. The reason is that I passed the extracted C code via a pipe to "tcc -run". The new gist run_from_memory_stdin.c does it correctly:

...
echo "foo"
tcc -run <(sed -n "/^\/\*\*$/,\$p" $0) 42
...

"foo" is printed by bash part, "bar 42" from C part (42 is passed argv[⁠1]), and piped script input gets printed from C code then:

$ route -n | ./run_from_memory_stdin.c 
foo
bar 42
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.29.58.98    0.0.0.0         UG    306    0        0 wlan1
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 wlan0
169.254.0.0     0.0.0.0         255.255.0.0     U     303    0        0 wlan0
172.29.58.96    0.0.0.0         255.255.255.252 U     306    0        0 wlan1
$ 
HermannSW
  • 161
  • 1
  • 8
  • Very good. Can it work for C++? – alfC Sep 16 '21 at 07:54
  • 1
    tcc is "Tiny C compiler", cannot deal with C++. I created g++ based [C++ script](https://gist.github.com/Hermann-SW/0d00f9afe0c274414220ee73e3a77120), where bash part outputs "hello " and C++ part outputs "world" (since g++ has no "-run" option, C++ executable is created under "/tmp" and executed from there). – HermannSW Sep 16 '21 at 13:56
  • Now works for C++ as well, see [C++ script section](https://github.com/Hermann-SW/memrun/tree/master/C#c-script-1) of my memrun C repo. – HermannSW Sep 19 '21 at 07:07
1

One can easily modify the compiler itself. It sounds hard at first but, thinking about it, it seems obvious. Modifying the compiler sources to directly expose a library and make it a shared library should not take that much effort (depending on the actual implementation).

Just replace every file access with a memory-mapped file.

This is something I am about to do: compile something transparently in the background to opcodes and execute those from within Java.

-

But thinking about your original question, it seems you want to speed up compilation and your edit-and-run cycle. First of all, get an SSD (use a PCIe version); you get almost memory speed. And let's say it's C we are talking about: the linking step results in very complex operations that are likely to take more time than reading and writing from/to disk. So just put everything on the SSD and live with the lag.

Martin Kersten
  • 5,127
  • 8
  • 46
  • 77
  • Thanks for the idea, I am using SSD already, but that is not the only point. As you said, the compiler could behave like an interpreter and execute just after compilation without creating files here and there. – alfC Apr 07 '15 at 05:13
  • Well if my try for the ASM in Java would work, someone might do the same with C. Would be nice. http://stackoverflow.com/questions/29481317/inline-asm-in-java – Martin Kersten Apr 07 '15 at 09:05
0

Finally, the answer to the OP's question is yes!

I found the memrun repo from guitmz, which demoed running an (x86_64) ELF from memory with Go and assembler. I forked it and provided a C version of memrun that runs ELF binaries (verified on x86_64 and armv7l), either from standard input or via process substitution as the first argument. The repo contains demos and documentation (memrun.c is only 47 lines of code):
https://github.com/Hermann-SW/memrun/tree/master/C#memrun
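
For the curious, here is a minimal sketch of one way such a tool can work on Linux (not necessarily how memrun.c is implemented): copy the ELF image from stdin into an anonymous in-memory file created with memfd_create(2), then fexecve(3) it. Requires Linux >= 3.17 and a reasonably recent glibc; error handling trimmed:

#define _GNU_SOURCE
#include <stdio.h>      /* perror */
#include <sys/mman.h>   /* memfd_create, MFD_CLOEXEC */
#include <unistd.h>     /* read, write, fexecve */

extern char **environ;

int main(int argc, char *argv[])
{
    /* anonymous file living entirely in memory, no filesystem entry at all */
    int fd = memfd_create("elf", MFD_CLOEXEC);   /* MFD_CLOEXEC is fine for ELF binaries */
    char buf[4096];
    ssize_t n;
    while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0)
        write(fd, buf, n);                       /* copy the ELF image into the memfd */
    fexecve(fd, argv, environ);                  /* returns only on error */
    perror("fexecve");
    return 1;
}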

Here is the simplest example: with "-o /dev/fd/1" the gcc-compiled ELF gets sent to stdout and piped into memrun, which executes it:

pi@raspberrypi400:~/memrun/C $ gcc info.c -o /dev/fd/1 | ./memrun
My process ID : 20043
argv[0] : ./memrun
no argv[1]
evecve --> /usr/bin/ls -l /proc/20043/fd
total 0
lr-x------ 1 pi pi 64 Sep 18 22:27 0 -> 'pipe:[1601148]'
lrwx------ 1 pi pi 64 Sep 18 22:27 1 -> /dev/pts/4
lrwx------ 1 pi pi 64 Sep 18 22:27 2 -> /dev/pts/4
lr-x------ 1 pi pi 64 Sep 18 22:27 3 -> /proc/20043/fd
pi@raspberrypi400:~/memrun/C $ 

The reason I was interested in this topic was its use in "C scripts". run_from_memory_stdin.c demonstrates it all together:

pi@raspberrypi400:~/memrun/C $ wc memrun.c | ./run_from_memory_stdin.c 
foo
bar 42
  47  141 1005 memrun.c
pi@raspberrypi400:~/memrun/C $ 

The C script producing the output shown is this small ...

#!/bin/bash

echo "foo"
./memrun <(gcc -o /dev/fd/1 -x c <(sed -n "/^\/\*\*$/,\$p" $0)) 42

exit
/**
*/
#include <stdio.h>

int main(int argc, char *argv[])
{
  printf("bar %s\n", argc>1 ? argv[1] : "(undef)");

  for(int c=getchar(); EOF!=c; c=getchar())  { putchar(c); }

  return 0;
}

P.S.:
I added tcc's "-run" option to gcc and g++; for details see:
https://github.com/Hermann-SW/memrun/tree/master/C#adding-tcc--run-option-to-gcc-and-g

Just nice, and nothing gets stored in the filesystem:

pi@raspberrypi400:~/memrun/C $ uname -a | g++ -O3 -Wall -run demo.cpp 42
bar 42
Linux raspberrypi400 5.10.60-v7l+ #1449 SMP Wed Aug 25 15:00:44 BST 2021 armv7l GNU/Linux
pi@raspberrypi400:~/memrun/C $ 
HermannSW
  • 161
  • 1
  • 8