I am building a simple application without glibc on 64-bit Linux, but I don't know how to get the command-line arguments.

I googled and found that RDI is argc and RSI is argv, but that didn't work.

Using gdb, I looked at the registers when the _start function begins, but both RDI and RSI were 0x0. I also tested with the simplest possible assembly program, and the result was the same: RDI and RSI were 0x0. I believe argc shouldn't be 0x0 even if I pass no arguments to the program.

_start:
jmp $

Here's the C code I tried:

//Print Hello world WITHOUT standard library
//ONLY WITH SYSTEM CALL

#define ReadRdi(To) asm("movq %%rdi,%0" : "=r"(To));
#define ReadRsi(To) asm("movq %%rsi,%0" : "=r"(To));

void __Print(const char *str);
void __Exit();

void Test(const char *a){
    long aL;
    ReadRdi(aL);
    char *aC = (int) aL;
    __Print("Test argument: ");
    __Print(aC);
    __Print("\n");
}

void _start() {
    long argcL;
    long argvL;
    ReadRdi(argcL);
    ReadRsi(argvL);
    int argc = (int) argcL;
    char **argv = (char **) argvL;
    __Print("Arguments: ");
    for(int i = 0; i < argc; i ++) {
        __Print(argv[i]);
        __Print(", ");
    }
    __Print("\n");
    Test("Hello, world!");
    __Exit();
}

The result is:

Arguments:
Test argument: Hello, world!

I checked the stack (the memory between RBP and RSP) using gdb, but there seemed to be nothing there.

I tried changing

void _start() {
    long argcL;
    long argvL;
    ReadRdi(argcL);
    ReadRsi(argvL);
    int argc = (int) argcL;
    char **argv = (char **) argvL;
    __Print("Arguments: ");

to

void _start(int argc, char **argv) {
    __Print("Arguments: ");

but I still don't see any arguments in the output.

__Print prints a message (using the sys_write system call), and __Exit exits the program (using the sys_exit system call).

Print.asm:

section     .text
global      __Print
__Print:
    mov rdx, 0
    push rdi
    jmp .count
.count:
    add rdx, 1
    add rdi, 1
    cmp byte[rdi], 0
    jne .count
.print:
    pop rdi
    mov     rcx, rdi                             ;message to write
    mov     rbx, 1                               ;file descriptor (stdout)
    mov     rax, 4                               ;system call number (sys_write)
    int     0x80                                ;call kernel
    ret

Exit.asm:

section     .text
global      __Exit
__Exit:
    mov rax,1                               ;system call number (sys_exit)
    int 0x80

Added: I linked with this command:

gcc -o sysHello Exit.o Print.o sysHello.o -nostdlib -nodefaultlibs -g
Gippeumi
  • Have you looked at [System V Application Binary Interface AMD64 Architecture Processor Supplement](http://www.x86-64.org/documentation/abi.pdf)? – Ian Abbott Feb 11 '16 at 11:05
  • Note that the _rdi, rsi_ convention does not apply to `_start` (the raw process entry point). Also, the 64 bit syscalls differ from 32 bit so all of those are wrong. Furthermore, by the time you read the registers in your inline asm, the compiler might have changed them, so don't rely on that behavior. – Jester Feb 11 '16 at 11:23
  • Identifiers starting with `_` followed by an uppercase letter or another underscore are reserved for the implementation. You must not use them in application code. – too honest for this site Feb 11 '16 at 11:38
  • If you overwrite the internal _start function that every program uses for initialisation, would you not break the control flow? Is this possible like that? – clockw0rk Oct 02 '19 at 13:15

1 Answer

The easiest thing will be to write _start in asm, and have it call your C functions using the standard calling convention.

See the x86 tag wiki for links to the ABI doc, which describes where to find everything at process startup. (Or use the link in Ian's comment.)

Writing _start as a C function would require inline asm, because there's no standard way to tell the compiler that argc is where the return address normally goes. So it's easier just to write it directly in asm and have it call main after putting args in registers for the normal calling convention.
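To make the startup layout concrete, here is a minimal C sketch of how the initial stack can be decoded once an asm _start has handed over the original stack pointer. It assumes the System V x86-64 layout, where [rsp] holds argc, the argv pointers follow, and the environment pointers come after argv's NULL terminator. The names `startup_args` and `parse_initial_stack` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; they do not appear in the question. */
struct startup_args {
    long   argc;
    char **argv;   /* argv[argc] is NULL */
    char **envp;   /* NULL-terminated list of environment strings */
};

/* `sp` is assumed to be the value RSP held at the first instruction of
   _start, before anything was pushed. */
struct startup_args parse_initial_stack(void *sp)
{
    char **words = (char **)sp;   /* view the stack as 8-byte slots */
    struct startup_args args;

    /* [rsp] holds argc, stored as a full pointer-sized word. */
    args.argc = (long)(intptr_t)words[0];
    /* [rsp + 8] is argv[0]; the pointers continue up to a NULL entry. */
    args.argv = words + 1;
    /* The environment pointers start right after argv's NULL terminator. */
    args.envp = args.argv + args.argc + 1;
    return args;
}
```

An asm _start could copy RSP into RDI (`mov rdi, rsp`) and then `call` a C function that uses this helper, so the C side only ever sees an ordinary pointer argument passed with the normal calling convention.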

A fragment of an asm program I use for testing things with perf counters:

cmp dword [rsp], 2
jg  addrmode_lat_3comp  ; argc > 2  ; $(seq 2)
jge addrmode_lat_1comp  ; argc >= 2 ; $(seq 1)
jmp loadlat_1comp       ; argc < 2  ; $(seq 0)

It jumps to one of three loops, depending on whether I run it with 2, 1, or no args, by testing argc. It uses NASM syntax.
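Once argc and argv are available, the output routine also has to use the right system-call interface. As Jester's comment notes, `int 0x80` with call numbers 4 and 1 is the 32-bit interface. The sketch below shows what a 64-bit write could look like, assuming x86-64 Linux and GCC/Clang extended asm: the call number goes in RAX (1 for write), the arguments go in RDI, RSI, and RDX, and the `syscall` instruction clobbers RCX and R11. The helper names `raw_write`, `str_len`, and `print_str` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: write(2) through the native x86-64 `syscall`
   instruction. Only valid on x86-64 Linux. Returns the number of bytes
   written, or a negative errno value on failure. */
long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                  /* result comes back in RAX */
        : "a"(1L),                   /* RAX = 1: write on x86-64 */
          "D"((long)fd),             /* RDI = file descriptor */
          "S"(buf),                  /* RSI = buffer address */
          "d"(len)                   /* RDX = byte count */
        : "rcx", "r11", "memory");   /* the kernel overwrites RCX and R11 */
    return ret;
}

/* Length of a NUL-terminated string, without libc. */
size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Print a NUL-terminated string to stdout (fd 1). */
long print_str(const char *s)
{
    return raw_write(1, s, str_len(s));
}
```

Exiting works the same way with call number 60 in RAX and the exit status in RDI. Keeping the asm inside one small wrapper means the rest of the program never depends on what the compiler happens to have left in a register.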

Peter Cordes