
The test platform is 32-bit Linux (but a solution for 32-bit Windows is also welcome).

Here is a C code snippet:

int a = 0;
printf("%d\n", a);

If I use gcc to generate assembly code:

gcc -S test.c

Then I will get:

      movl    $0, 28(%esp)
      movl    28(%esp), %eax
      movl    %eax, 4(%esp)
      movl    $.LC0, (%esp)
      call    printf
      leave
      ret

This assembly code has to be linked against libc to work (because of the call to printf).

My question is :

Is it possible to automatically convert C to assembly that uses only explicit system calls, without using libc?

Like this:

    pop ecx
    add ecx,host_msg-host_reloc
    mov eax,4
    mov ebx,1
    mov edx,host_msg_len
    int 80h
    mov eax,1
    xor ebx,ebx
    int 80h

That is, code that invokes the int 80h software interrupt directly.

Is it possible? If so, is there any tool that does this?

Thank you!

Peter Cordes
lllllllllllll
  • I don't understand the question. The `gcc` compiler is a good tool to convert *C* code to assembly (but there are others). And you might use `gcc -O -fverbose-asm -S` to get some better assembly code. Or is your question about how to make syscalls without any C code? Then read the [Linux Assembler HowTo](http://tldp.org/HOWTO/Assembly-HOWTO/) – Basile Starynkevitch Jan 22 '14 at 18:30
  • You haven't clearly explained what's wrong with the gcc method. Are you just talking about the difference in representation? see: http://stackoverflow.com/q/972602/10396 – AShelly Jan 22 '14 at 18:31
  • @BasileStarynkevitch Hi, I am sorry and I modified the question – lllllllllllll Jan 22 '14 at 18:39
  • @AShelly Hi, sorry and I modified the question... – lllllllllllll Jan 22 '14 at 18:39
  • Reverse -- compile and create executable then use disassembler for ASM code you can use objdump. – Grijesh Chauhan Jan 22 '14 at 18:40
  • @GrijeshChauhan , is there anyway without the **reverse** step, cause basically reverse would introduce some uncertainty(I have some IDA pro experience..).... – lllllllllllll Jan 22 '14 at 18:44
  • I'm not aware of any C compiler that would statically insert library bodies into generated assembly. You might examine alternative libc(s), but they aren't going to insert their implementations (just calls). – Elliott Frisch Jan 22 '14 at 18:46
  • @computereasy note some function's implementation like printf is system dependent so if you call the function it will link to the library You have to manually replace interrupt information(or function body). -- Why do you wants this? Any kind of infection analysis. Malware/Benign? – Grijesh Chauhan Jan 22 '14 at 18:53
  • **Why do you ask**? Are you coding your own `libc`? – Basile Starynkevitch Jan 22 '14 at 20:35
  • Your C code contains a call to `printf`, therefore, unless the compiler is very clever, the generated assembly code will contain a call to `printf`. Directly calling the `int 80h` software interrupt will most likely not do the same thing; I believe only the OS kernel is permitted to do that. And even if it worked, it would not duplicate `printf`'s ability to send its output to `stdout`, which could be the console, a terminal emulator, a file, a pipe, or any of several other things. – Keith Thompson Jan 22 '14 at 21:03
  • @GrijeshChauhan well, I am asked to use polymorphic engine to do some benign work, but basically from the engines I have seen, all of them can only mutate assembly code with system calls, assembly code use some libc function are not allowed, because basically engine can not deal with PLT/GOT – lllllllllllll Jan 22 '14 at 21:14
  • @BasileStarynkevitch well, I am asked to use polymorphic engine to do some benign work, but basically from the engines I have seen, all of them can only mutate assembly code with system calls, assembly code use some libc function are not allowed, because basically engine can not deal with PLT/GOT – lllllllllllll Jan 22 '14 at 21:15
  • @KeithThompson Hi, thank you and I explain why I am trying to do this in the comment above. So it seems that I should adjust my test approach, from "translate widely used c project into assembly code and run" to "use widely used assembly project and run"... :( – lllllllllllll Jan 22 '14 at 21:18
  • Define what *polymorphic engine* means to you. – Basile Starynkevitch Jan 22 '14 at 21:23
  • @computereasy: please edit your question to improve it. Add what you explained in comments back into the question! – Basile Starynkevitch Jan 22 '14 at 21:27
  • @computereasy Which engine you are using (I think you are using some metamorphic engine..) Then why not to compile, create executable then pass disassembled code to Engine? – Grijesh Chauhan Jan 23 '14 at 04:57
  • @GrijeshChauhan I use "Linux Mutation Engine". and I don't think it is metamorphic...:( I download several engines on the VXHeavens and basically most of them are 10 years ago, using tasm on win32 or DOS.. I am trying to do this on Linux because I just try to demonstrate its concept and I am more familiar with Linux.. "Linux Mutation Engine" is the only one I can find... any suggestions on a good polymorphic/metamorphic engine...? – lllllllllllll Jan 23 '14 at 06:16
  • @GrijeshChauhan Why not disassembly ? In my humble opinion, disassembled process make error from time to time, and what's worse, most time you need to modify the disassembled code in order to make it compilable even using IDA pro(I have done an immature work on that, and it is tedious...) – lllllllllllll Jan 23 '14 at 06:17
  • @computereasy [Dr. Mark Stamp](http://cs.sjsu.edu/~stamp/) have written his own- engine to create metamorphic variants. You can mail him. Generally IDA pro fails to disassemble when file in encrypted (by some polymorphic engine) to dis-properly you have to unpack file. Try with [`Ether: Malware Analysis via Hardware Virtualization Extensions`](http://ether.gtisc.gatech.edu/) – Grijesh Chauhan Jan 23 '14 at 06:25

3 Answers


Not from that source code. A call to printf() cannot be converted by the compiler to a call to the write system call - the printf() library function contains a significant amount of logic which is not present in the system call (such as processing the format string and converting integer and floating-point numbers to strings).
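To give a sense of the logic that lives in printf() rather than in the kernel, here is a minimal sketch of the integer-to-text conversion that "%d\n" implies. The function name format_int_line is hypothetical (it is not part of any library), and the sketch assumes a 32-bit int; the real printf() handles far more (widths, flags, other conversions, buffering).

```c
#include <stddef.h>

/* Hypothetical helper, not a libc function: writes the decimal form of
   value followed by '\n' into buf and returns the number of bytes stored.
   Assumes a 32-bit int, so buf must hold at least 12 bytes
   (sign + 10 digits + newline). No NUL terminator is added. */
size_t format_int_line(int value, char *buf)
{
    char tmp[10];               /* digits, least significant first */
    size_t n = 0, len = 0;
    /* Work in unsigned so that INT_MIN can be negated without overflow. */
    unsigned int u = (unsigned int)value;

    if (value < 0) {
        buf[len++] = '-';
        u = 0u - u;
    }
    do {                        /* always emits at least one digit */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (n > 0)               /* reverse the digits into buf */
        buf[len++] = tmp[--n];
    buf[len++] = '\n';
    return len;
}
```

Only after a conversion like this does the library hand a plain byte buffer and a length to the write system call.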

It is possible to generate system calls directly, but only by using inline assembly. For instance, to generate a call to _exit(0) (not quite the same as exit()!), you would write:

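#include <asm/unistd.h>   /* defines __NR_exit_group */
...
int retval;
/* eax = system call number, ebx = first argument (the exit status).
   volatile keeps the compiler from discarding the statement. */
asm volatile("int $0x80" : "=a" (retval) : "a" (__NR_exit_group), "b" (0) : "memory");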

For more information on GCC inline assembly, particularly on the constraints I'm using here to map variables to registers, please read the GCC Inline Assembly HOWTO. It's rather old, but still perfectly relevant.

Note that doing this is not recommended. The exact calling conventions for system calls (e.g., which registers are used for the call number and arguments, how errors are returned, etc.) differ between architectures, between operating systems, and even between 32-bit and 64-bit x86. Code written this way is very difficult to maintain.
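As an illustration of how much the convention changes even within x86, here is a hedged sketch of a write wrapper for Linux only. The name raw_write is hypothetical, and the call numbers are hard-coded (4 for write on 32-bit x86, 1 on x86-64) instead of being taken from the kernel headers; other architectures are deliberately not covered.

```c
#include <stddef.h>

/* Hypothetical wrapper, not part of any library. Returns the number of
   bytes written, or a negative errno value as the kernel reports it. */
long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
#if defined(__linux__) && defined(__i386__)
    /* 32-bit x86: call number in eax, arguments in ebx, ecx, edx. */
    __asm__ volatile ("int $0x80"
                      : "=a" (ret)
                      : "a" (4L), "b" (fd), "c" (buf), "d" (len)
                      : "memory");
#elif defined(__linux__) && defined(__x86_64__)
    /* x86-64: call number in rax, arguments in rdi, rsi, rdx.
       The syscall instruction overwrites rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "a" (1L), "D" ((long)fd), "S" (buf), "d" (len)
                      : "rcx", "r11", "memory");
#else
#error "this sketch only covers Linux on 32-bit and 64-bit x86"
#endif
    return ret;
}
```

A call such as raw_write(1, "hello\n", 6) writes to standard output without going through libc. Note that the two branches share neither the instruction, the call number, nor the argument registers, which is exactly the maintenance burden described above.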

Timothy Baldwin

You can certainly compile C code to assembly without linking to libc, but you can't use the C library functions. Libc's entire purpose IS to provide the interface from C library functions to Linux system calls (or Windows, or whatever system you're on). So, if you didn't want to use libc, you would have to write your own wrappers to the system calls.

chbaker0

If you compile C code that does not use any function from the C library (e.g. no printf, no malloc, and so on) in GCC's freestanding mode (i.e. with the -ffreestanding flag), you will need either to call some assembler function (from some other object file or library) or to use asm statements, because you cannot do any kind of input or output without making a syscall.

Read also the Assembly HowTo, the x86 calling conventions, and the ABI relevant to your kernel (probably the x86-64 ABI), and make sure you understand what system calls are, starting with syscalls(2), and what the VDSO is (int 80 is not the best way to make syscalls these days; SYSENTER is often better). Study the source code of some libc, in particular musl libc (whose source code is very readable).

On Windows (which is not free software and which I don't know) the question could be much more difficult: I am not sure that the system call level is exactly and completely documented.

The libffi library enables you to call arbitrary functions from C. You could also cast function pointers obtained from dlsym(3). You could consider JIT techniques (e.g. libjit, GNU lightning, asmjit, etc.).

Basile Starynkevitch