
I am working on a homework assignment, which is to:

write a simple Assembly program whose only job is to call a C program and pass it the command-line arguments, so that it runs properly with argc and argv.

How can this be done? We were given this asm as part of the assignment:

section .text

  global _start
  extern main

_start:

  ;;code to setup argc and argv for C program's main()

  call    main

  mov eax,1
  int 0x80

So what I want to know is: where are argc and argv located? Also, do I just need to put the pointer to argc in the eax register, as when returning a value from a regular C function, and the C program will work the rest out?

I compile my C program with the following Makefile (I am new to Assembly; this is the Makefile our teacher gave us, and I do not fully understand it):

%.o: %.asm
        nasm -g -O1 -f elf -o $@ $<

%.o: %.c
        gcc -m32 -g -nostdlib -fno-stack-protector -c -o $@ $<

all: lwca

lwca: lwc.o start.o
        ld  -melf_i386 -o $@ $^

Running ./lwca arg1 arg2 should result in argc = 3, argv[1] = arg1, and argv[2] = arg2.

ANSWER: No answer quite solved my problem; in the end, what worked was:

pop    dword ecx    ; ecx = argc
mov    ebx,esp      ; ebx = argv
push   ebx   ; char** argv
push   ecx   ; int argc


call    main
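
To make the layout concrete, here is a minimal C sketch of what those four instructions compute. It is only a model, not startup code: an ordinary array of words stands in for the real stack, and the names `start_args` and `locate_args` are made up for illustration. It assumes the i386 Linux process-startup layout described in the comments below (argc at the stack pointer, the argv pointers immediately after it, a NULL terminator, then the envp pointers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the values a _start routine would hand to main. */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp models the value of esp on entry to _start. Each element is one
 * stack word; the first word holds argc stored as an integer. */
static struct start_args locate_args(char **sp)
{
    struct start_args a;

    assert(sp != NULL);
    a.argc = (int)(intptr_t)sp[0];  /* pop ecx      : argc is the word at [esp]      */
    a.argv = sp + 1;                /* mov ebx, esp : argv is the array that follows */
    a.envp = a.argv + a.argc + 1;   /* envp begins after argv's NULL terminator      */
    return a;
}
```

Given a word array holding 3, "./lwca", "arg1", "arg2", NULL, the function reports argc = 3 with argv[1] and argv[2] pointing at arg1 and arg2, which is what `./lwca arg1 arg2` should produce. Note that argv is the address of the array already sitting on the stack, not a copy of its first element.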
Nivolas
  • Executing a program and passing arguments is done at the operating-system level. You can use `call` to reach a function from a compiled C library, but not to run and execute another program. – Jacek Nov 24 '17 at 08:54
  • What values do you intend to provide as argc (number of commandline strings) and argv (array of pointers to char, representing the commandline strings)? Do you have any such? Would you like to simply state "there are none" instead? Please explain the environment, this looks like startup code for an embedded device, to be linked from reset vector. Does anything like a commandline which starts this even exist? – Yunnosch Nov 24 '17 at 08:55
  • @Rob: https://stackoverflow.com/questions/8863042/how-to-call-c-functions-from-assembly-routines-and-link-the-c-and-assembly-files is not a duplicate. This question wants to implement a minimal replacement for the CRT startup code that normally calls `main`. The one you linked is writing `main` in asm and calling other C functions. – Peter Cordes Nov 24 '17 at 09:04
  • Your title says x64 (i.e. x86-64), but then your question is building it as a 32-bit executable. Is the title wrong? – Peter Cordes Nov 24 '17 at 09:07
  • Why are you using `nasm -O1` ("minimal optimization") instead of the default multi-pass optimization? – Peter Cordes Nov 24 '17 at 09:08
  • BTW, I'd suggest you compile a Hello World in C normally, then `gdb ./hello`, and set a breakpoint at `_start` and single-step through gcc's startup code. It does significantly more work than you'd expect, mostly checking lists of constructors to see if anything needs to be run at startup before calling main. (In a simple hello world, they'll be empty.) `main()` is a regular function called the usual way (for 32-bit, with args on the stack). – Peter Cordes Nov 24 '17 at 09:11
  • @Yunnosch: From the `int 0x80`, we can tell this is Linux. The i386 System V ABI says that `argc` is pointed to by the stack pointer at process startup (i.e. on entry to `_start`), and `argv` starts one entry above that. (The array by value, not a pointer to it). – Peter Cordes Nov 24 '17 at 09:21
  • @Nivolas: How was your homework actually worded? Did it really say to call another *program*? Because that implies you should be making an `execve` system call to a separately-compiled executable, not replacing / re-implementing the CRT startup code that calls `main`. – Peter Cordes Nov 24 '17 at 09:22
  • @PeterCordes So this means that a pointer to argc is in `esp` and that `argv` starts at `esp+4`? The homework was stated as I have written it; the Assembly code I have entered was also given to me. – Nivolas Nov 24 '17 at 09:23
  • Yes, in a 32-bit executable. See the ABI doc: https://github.com/hjl-tools/x86-psABI/wiki/intel386-psABI-1.1.pdf which describes the process startup state. It's the same as in the linked duplicate, except that's for x86-64 where everything on the stack is 8 bytes wide instead of 4. – Peter Cordes Nov 24 '17 at 09:24
  • @PeterCordes Pushing `[esp]` and `[esp+4]` onto the stack and then calling main does not work. Any ideas? – Nivolas Nov 24 '17 at 11:16
  • `push` modifies `esp`. Also, `main` wants a *pointer to* `argv`, not the first element. The array itself is right there on the stack. `lea eax, [esp+4]` / `push eax` / `push [esp]` might work, if I got everything right. Oh, `esp` should be 16B-aligned before you call `main`. Depending on the program, it might segfault if you don't. So you should `sub esp, 8` before pushing 2 more args, because the ABI does guarantee that `esp` is 16B-aligned at process startup. Also, the normal CRT startup passes `envp` to main. I assume your `main` doesn't look for a 3rd `envp` arg? – Peter Cordes Nov 24 '17 at 11:30
  • Since this is actually a 32-bit question, it's not an exact duplicate of https://stackoverflow.com/questions/35864291/get-argv2-address-in-assembler-x64, but it is still highly related. – Peter Cordes Nov 24 '17 at 11:39
  • @PeterCordes Nope, it doesn't look for the third argument. What you have suggested does not work. Maybe I am misunderstanding how I should reset `esp`? after every push, I need to perform `sub esp, 4` so that in the end it stays where it started? – Nivolas Nov 24 '17 at 12:14
  • No, that would skip an extra 4 bytes after every push. Remember that `push [esp]` is `tmp = [esp]` / `esp -= 4` / `[esp] = tmp`, so doing one push changes the offset needed to reach other things on the stack by `+4`. – Peter Cordes Nov 24 '17 at 22:26
  • I think I had a bug in what I suggested first: I pushed two copies of `argv`, because I used the wrong offset for the 2nd push. `lea eax, [esp+4]` / `push eax` / `push [eax-4]` might work, but **use a debugger** to see what you're pushing and where `esp` is pointing at each step. (Notice that I used `[eax-4]` instead of `[esp+4]` for efficiency reasons, and because it's easier to get it right: we used an `lea` to set `eax` before we started modifying `esp`.) – Peter Cordes Nov 24 '17 at 22:30
  • And BTW, I found a 32-bit duplicate for your question with a much more detailed answer (I edited the duplicate earlier). https://stackoverflow.com/questions/16721164/x86-linux-assembler-get-program-parameters-from-start. Go read it, it shows you how to look at memory with GDB. – Peter Cordes Nov 24 '17 at 22:37
  • I was curious, so I tried the code from my last comment. It works too. (The 2nd `push` needs `push dword [eax-4]` to specify the operand-size, but all the offsets are correct. I tested it with a simple `main` that uses `printf`, so I had to link with `gcc -m32 crt-replacement.o print-args.c -nostdlib -lc`). Anyway, your code is equivalent, and nice and compact. (But you could save the `mov` instruction by using `push esp`, which [is defined](http://felixcloutier.com/x86/PUSH.html) to push the value of `esp` *before* push decrements it.). – Peter Cordes Nov 26 '17 at 00:22
  • BTW, after `call main` you (or your professor) should use `mov ebx, eax` to pass `main`'s return value to `sys_exit` instead of having exit_status = whatever was left in `ebx`. (Which is actually `0` if you don't modify it before calling main, because `ebx` is call-preserved and Linux inits it to 0. But your version exits with status = low byte of `esp`...) – Peter Cordes Nov 26 '17 at 00:27
  • Anyway, this replacement for CRT doesn't work in general for arbitrary C programs. Language features like `atexit` (register a function to be called after `main` or on `exit()`) depend on CRT code to check that (related: [exit functions / syscalls](https://stackoverflow.com/questions/46903180/syscall-implementation-of-exit/46903734#46903734)). It also doesn't initialize libc if statically linking. (But that Makefile doesn't link libc in the first place.) Anyway, just my 2 cents that this minimal startup code doesn't do everything that the compiler-provided version does. – Peter Cordes Nov 26 '17 at 00:29
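
The offset discussion in the comments above can be checked with a toy model. The following is a rough C sketch, not real startup code: `struct machine`, `push`, `pop`, and the three `setup_*` functions are hypothetical names, the stack is a small array of 4-byte words, and `esp` is kept as a word index (so `esp -= 4` in assembly is `esp -= 1` here). It compares the `pop`/`mov`/`push`/`push` sequence from the question, the `lea eax, [esp+4]` / `push eax` / `push dword [eax-4]` sequence, and the earlier variant whose second push reads `[esp]` after `esp` has already moved.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Toy 32-bit stack: mem[i] is the word at byte address i * 4. */
struct machine {
    uint32_t mem[STACK_WORDS];
    int esp;                      /* word index of the top of the stack */
};

static void push(struct machine *m, uint32_t value)
{
    assert(m->esp > 0);
    m->esp -= 1;                  /* push decrements esp first ...      */
    m->mem[m->esp] = value;       /* ... then stores at the new top     */
}

static uint32_t pop(struct machine *m)
{
    assert(m->esp < STACK_WORDS);
    uint32_t value = m->mem[m->esp];
    m->esp += 1;
    return value;
}

/* pop ecx / mov ebx, esp / push ebx / push ecx */
static void setup_pop(struct machine *m)
{
    uint32_t ecx = pop(m);                     /* argc                       */
    uint32_t ebx = (uint32_t)m->esp * 4;       /* address of argv[0]'s slot  */
    push(m, ebx);                              /* char **argv                */
    push(m, ecx);                              /* int argc                   */
}

/* lea eax, [esp+4] / push eax / push dword [eax-4] */
static void setup_fixed(struct machine *m)
{
    uint32_t eax = (uint32_t)(m->esp + 1) * 4; /* taken before esp changes   */
    push(m, eax);                              /* char **argv                */
    push(m, m->mem[eax / 4 - 1]);              /* argc, addressed via eax    */
}

/* lea eax, [esp+4] / push eax / push dword [esp] */
static void setup_buggy(struct machine *m)
{
    uint32_t eax = (uint32_t)(m->esp + 1) * 4;
    push(m, eax);
    push(m, m->mem[m->esp]);                   /* [esp] is now the argv
                                                  pointer that was just pushed */
}
```

Starting each variant with `esp` at word 8 and `mem[8] = 3` (argc), the first two leave argc on top of the stack with the argv address one word above it, which is the order `main` expects for 32-bit cdecl. The third leaves two copies of the argv address, matching the bug described in the comments. The model ignores the 16-byte alignment point and `envp`.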
