This is bad. You might have noticed that the compiler emits the function prologue (push %rbp followed by mov %rsp,%rbp) for the function _start:
400292: 55 push %rbp
400293: 48 89 e5 mov %rsp,%rbp
If you are going to do this, then at least consider compiling with -fomit-frame-pointer. Because the function prologue pushes RBP, your pop %rcx doesn't place the number of command line arguments into RCX; it loads the saved value of RBP (which is now at the top of the stack). This cascades: all of your subsequent stack operations work on the wrong values.
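To make the one-slot shift concrete, here is a small sketch that models the stack as a plain array. Everything in it (model_stack, model_pop, model_sequence, and the sample values standing in for pointers and the saved RBP) is made up for illustration; it only simulates the layout and never touches the real stack.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the stack: slots[top] is the current top of stack. */
typedef struct {
    const uint64_t *slots;
    size_t top;
} model_stack;

/* Stack as the kernel hands it to _start: argc, argv[0], argv[1], NULL.
   0x1000 and 0x2000 are stand-ins for the two string pointers. */
static const uint64_t at_entry[] = { 2, 0x1000, 0x2000, 0 };

/* Same stack after a compiler-generated "push %rbp".
   0xBBBB is a stand-in for the saved RBP value, now on top. */
static const uint64_t after_push[] = { 0xBBBB, 2, 0x1000, 0x2000, 0 };

/* Models "pop reg": read the top slot, then move to the next one. */
static uint64_t model_pop(model_stack *s)
{
    return s->slots[s->top++];
}

/* Models the sequence "pop %rcx; add $8, %rsp; pop %rdi". */
static void model_sequence(model_stack *s, uint64_t *rcx, uint64_t *rdi)
{
    *rcx = model_pop(s);  /* pop %rcx     */
    s->top += 1;          /* add $8, %rsp */
    *rdi = model_pop(s);  /* pop %rdi     */
}
```

Run against at_entry, the model leaves argc (2) in RCX and the argv[1] stand-in (0x2000) in RDI. Run against after_push, RCX receives the saved RBP stand-in (0xBBBB) and RDI receives the argv[0] stand-in (0x1000), which is the cascade described above.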
Rather than omitting the frame pointer, as in my first suggestion, you could have coded the _start function directly like this:
asm ( ".global _start;" /* Make start symbol globally visible */
"_start:;"
"pop %rcx;" /* Contains argc */
"cmp $2, %rcx;" /* If argc = 2 (argv[0] & argv[1] exist) */
"jne exit;" /* If it's not 2, exit */
"add $8, %rsp;" /* Move stack pointer to argv[1] */
"pop %rdi;" /* Pop off stack */
"mov $85, %rax;" /* #define __NR_creat 85 */
"mov $0x2E8, %rsi;" /* move 744 to rsi */
"syscall;"
"exit:;"
"mov $60, %rax;" /* sys_exit */
"mov $2, %rdi;"
"syscall"
);
Since the normal process of declaring a C++ function has been bypassed we don't need to worry about the compiler adding prologue and epilogue code.
The file mode bits you use for sys_creat are incorrect. You have:
"mov $0x2E8, %rsi;" /* move 744 to rsi */
0x2E8 is 744 decimal. I believe your intention was to put 744 octal into %RSI; 744 octal is 0x1E4. To make the value more readable, you can write it in octal in GAS by prefixing it with a 0. This is what you were looking for:
"mov $0744, %rsi;" /* File mode octal 744 (rwxr--r--) */
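C uses the same leading-0 rule for octal constants as GAS, so the arithmetic above can be checked in C. This is only a sketch: the names MODE_AS_WRITTEN, MODE_INTENDED and mode_from_macros are made up here, and the S_I* constants are the standard POSIX permission macros from <sys/stat.h>.

```c
#include <sys/stat.h>

enum {
    MODE_AS_WRITTEN = 0x2E8, /* what the original mov loaded: 744 decimal, 01350 octal */
    MODE_INTENDED   = 0744   /* leading 0 makes the constant octal: 0x1E4, 484 decimal */
};

/* The intended mode spelled with the POSIX permission macros (rwxr--r--). */
static unsigned mode_from_macros(void)
{
    return S_IRWXU | S_IRGRP | S_IROTH;
}
```

mode_from_macros() returns the same value as MODE_INTENDED, while MODE_AS_WRITTEN is a different bit pattern altogether.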
Rather than:
"pop %rsi;" /* Pop off stack */
"mov %rsi, %rdi;" /* Move argv[1] to rdi */
You could have popped directly into %rdi:
"pop %rdi;" /* Pop off stack */
You could also have left the parameters in place on the stack and accessed them directly, like this:
asm ( ".global _start;" /* Make start symbol globally visible */
"_start:;"
"cmpq $2, (%rsp);" /* If argc = 2 (argv[0] & argv[1] exist); q suffix because no register operand gives the size */
"jne exit;" /* If it's not 2, exit */
"mov 16(%rsp), %rdi;" /* Get pointer to argv[1] */
"mov $85, %eax;" /* #define __NR_creat 85 */
"mov $0744, %esi;" /* File mode octal 744 (rwxr--r--) */
"syscall;"
"exit:;"
"mov $60, %eax;" /* sys_exit */
"mov $1, %edi;"
"syscall"
);
In this last code snippet I've also switched to 32-bit registers in some instances. You can take advantage of the fact that in x86-64 code, writing a value to a 32-bit register automatically zero-extends it into the upper 32 bits of the corresponding 64-bit register. This can save a couple of bytes in the instruction encoding.
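If you want to see the zero extension for yourself, the following sketch fills a 64-bit variable with all ones and then writes only the 32-bit half of its register. It relies on GCC extended asm and the x86 %k operand modifier (which prints the 32-bit register name), used here purely to get the result back into C. The function name write_low_half is made up, and the #else branch is just a fallback so the sketch still builds on other architectures.

```c
#include <stdint.h>

/* Start with all 64 bits set, then write only the low 32-bit register. */
static uint64_t write_low_half(void)
{
    uint64_t value = UINT64_MAX;
#if defined(__x86_64__)
    /* %k0 asks the compiler for the 32-bit name of the register holding value. */
    __asm__ ("movl $60, %k0" : "+r"(value));
#else
    value = (uint32_t)60; /* fallback: plain C zero extension */
#endif
    return value;
}
```

On x86-64 the function returns 60 rather than 0xFFFFFFFF0000003C, because the 32-bit write clears the upper half of the register.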
Accessing Command Line Parameters via main w/64-bit Code
If you compile using the C/C++ runtime, the runtime supplies a _start label that performs program startup and converts the command line parameters passed by the OS into the form required by the 64-bit System V ABI before calling main. Parameter passing is discussed in section 3.2.3 of that ABI. In particular, the first two parameters to main in 64-bit code are passed via RDI and RSI. RDI will contain the value argc (an int, so only its lower half, EDI, is significant) and RSI will contain a pointer to an array of char * pointers. Since these parameters are not passed via the stack, we don't need to concern ourselves with any function prologue and epilogue code.
int main(int argc, char *argv[])
{
asm ( "cmp $2, %edi;" /* If argc = 2 (argv[0] & argv[1] exist) */
"jne exit;" /* If it's not 2, exit */
/* RSI (second arg to main) is a pointer
to an array of character pointers */
"mov 8(%rsi), %rdi;"/* Get pointer to second char * pointer in argv[] */
"mov $85, %eax;" /* #define __NR_creat 85 */
"mov $0744, %esi;" /* File mode octal 744 (rwxr--r--) */
"syscall;"
"exit:;"
"mov $60, %eax;" /* sys_exit */
"mov $1, %edi;"
"syscall"
);
}
You should be able to compile this with:
g++ -o testargs testargs.c -g
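For comparison, here is roughly the same logic in plain C, using the libc creat() wrapper instead of a raw syscall. It is a sketch, not a drop-in replacement: the helper name create_from_args is made up, and unlike the assembly (which always exits with status 1) it returns 0 when the file was created so the caller can tell the two cases apart.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper mirroring the checks done by the inline assembly.
   Returns 0 if the file was created, 1 otherwise. */
static int create_from_args(int argc, char *argv[])
{
    if (argc != 2)                  /* need exactly argv[0] and argv[1] */
        return 1;

    int fd = creat(argv[1], 0744);  /* same octal mode as the assembly version */
    if (fd < 0)
        return 1;

    close(fd);
    return 0;
}
```

A main that simply does return create_from_args(argc, argv); would behave like the assembly version for the argument check, with the compiler taking care of the calling convention.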
Special note: If you intend to eventually use inline assembly along with C/C++ code, you are going to have to learn about GCC extended assembler templates, constraints, clobbers, etc. That is beyond the scope of this question. Learning assembler is much more difficult with inline assembly than with creating separate assembly code objects and calling them from C/C++. It is very easy to use GCC's extended inline assembly improperly. Code may seem to work at first, but subtle bugs can creep in as the program gets more complex.