I just wrote a C program that prints its command line argument without using the standard library or a main() function. My motivation is simple curiosity and a wish to understand how to work with inline assembly. I am using Ubuntu 17.10 x86_64 with the 4.13.0-39-generic kernel and GCC 7.2.0.
Below is my code, which I have commented as far as I understand it. It consists of the functions print, print_1, my_exit, and _start(). The last one is required by the system to run the executable: without _start() the linker emits a warning and the program segfaults.
The functions print and print_1 are different. The first one prints a string to the console and measures the length of the string internally. The second one needs the string length passed as an argument. The my_exit() function just exits the program, returning the given value, which in my case is the string length or the number of command line arguments.
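For comparison, a write call like the one behind print and print_1 can be written with explicit register constraints and a clobber list, so the compiler is told which registers the asm block uses and that the syscall instruction overwrites RCX and R11. This is only a sketch under assumptions: raw_write is a hypothetical name, the asm path assumes Linux on x86-64 with GCC-style inline assembly (write is syscall number 1), and the #else branch falls back to the libc write() purely so the snippet compiles on other platforms.

```c
#include <stddef.h>

#if !(defined(__linux__) && defined(__x86_64__))
#include <unistd.h>   /* fallback only, not needed on Linux x86-64 */
#endif

/* Hypothetical helper: writes len bytes from buf to file descriptor fd.
   Returns the byte count, or a negative value on error. */
long raw_write(int fd, const void *buf, size_t len) {
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)               /* result comes back in RAX        */
        : "a"(1L),                /* syscall number 1 = write        */
          "D"((long)fd),          /* 1st argument in RDI             */
          "S"(buf),               /* 2nd argument in RSI             */
          "d"(len)                /* 3rd argument in RDX             */
        : "rcx", "r11", "memory"  /* syscall overwrites RCX and R11  */
    );
    return ret;
#else
    return (long)write(fd, buf, len);
#endif
}
```

With the operands pinned to registers through constraints, no manual movq or push/pop is needed inside the asm string.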
print_1 requires the string length as an argument, so the characters are counted with a while() loop and the length is stored in strLength. In this case everything works pretty well.
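The counting loop used with print_1 can be isolated into a small helper. This is a minimal sketch in plain C; count_chars is a hypothetical name that does not appear in my program, and it uses nothing from the standard library beyond the size_t type.

```c
#include <stddef.h>

/* Hypothetical helper: counts the bytes before the terminating NUL,
   the same way the while() loop in _start does. */
size_t count_chars(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

The result can be passed directly as the length argument of a function like print_1.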
Strange things happen when I use the print function, which measures the string length internally. Simply put, it looks like this function somehow changes the string pointer so that it points to the environment variables, which should be the next pointer after the arguments. Instead of the first argument, the function prints "CLUTTER_IM_MODULE=xim", which is my first environment variable. My workaround is to copy a into b on the next line and pass b to print.
I couldn't find any explanation inside the counting procedure, but it looks like it's changing my string pointer.
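One thing that may be worth checking is which registers the counting code changes without telling the compiler. The asm block in print writes RAX, RDI, RDX and RSI, and syscall itself overwrites RCX and R11, but no clobbers are declared, so the compiler may assume those registers still hold its own values. Whether that explains the behaviour here depends on the generated code. The sketch below only shows the same repne scasb count with the registers declared as operands; scas_strlen is a hypothetical name, the asm path assumes x86-64 with GCC-style inline assembly, and the #else branch is a plain loop so the snippet compiles elsewhere. Note that after repne scasb RDI points one past the terminating NUL, so the pointer difference is the length plus one; the sketch subtracts one.

```c
#include <stddef.h>

/* Hypothetical helper: string length via repne scasb, with every
   register the instruction touches declared to the compiler. */
size_t scas_strlen(const char *str) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    const char *end = str;          /* RDI: advanced by scasb        */
    size_t count = (size_t)-1;      /* RCX: maximum bytes to scan    */
    __asm__ volatile(
        "repne scasb"
        : "+D"(end), "+c"(count)    /* both are read and modified    */
        : "a"(0)                    /* AL = 0, the byte to search    */
        : "cc", "memory"
    );
    /* end now points one past the NUL terminator */
    return (size_t)(end - str) - 1;
#else
    size_t n = 0;
    while (str[n] != '\0') n++;
    return n;
#endif
}
```

Because RDI and RCX are read-write operands, the compiler knows they change and keeps its own copy of str intact.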
unsigned long long print(char * str){
    unsigned long long ret;
    __asm__(
        "pushq %%rbx \n\t"
        "pushq %%rcx \n\t" //RBX and RCX to the stack for further restoration
        "movq %1, %%rdi \n\t" //pointer to string (char * str) into RDI for SCASB instruction
        "movq %%rdi, %%rbx \n\t" //saving RDI in RBX for final subtraction
        "xor %%al, %%al \n\t" //zeroing AL for SCASB comparison
        "movq $0xffffffff, %%rcx \n\t" //max string length for REPNE instruction
        "repne scasb \n\t" //counting "loop" see details: https://www.felixcloutier.com/x86/index.html for REPNE and SCASB instructions
        "sub %%rbx, %%rdi \n\t" //final subtraction
        "movq %%rdi, %%rdx \n\t" //string length for write syscall
        "movq %%rdi, %0 \n\t" //string length into ret to return from print
        "popq %%rcx \n\t"
        "popq %%rbx \n\t" //RBX and RCX restoration
        "movq $1, %%rax \n\t" //write - 1 for syscall
        "movq $1, %%rdi \n\t" //file descriptor for write, $1 - stdout
        "movq %1, %%rsi \n\t" //source string pointer
        "syscall \n\t"
        : "=g"(ret)
        : "g"(str)
    );
    return ret;
}
void print_1(char * str, int l){
    int ret = 0;
    __asm__("movq $1, %%rax \n\t" //write - 1 for syscall
            "movq $1, %%rdi \n\t" //file descriptor for write, $1 - stdout
            "movq %1, %%rsi \n\t" //pointer to the string to write
            "movl %2, %%edx \n\t" //string length
            "syscall"
            : "=g"(ret)
            : "g"(str), "g" (l));
}
void my_exit(unsigned long long ex){
    int ret = 0;
    __asm__("movq $60, %%rax\n\t" //syscall 60 - exit
            "movq %1, %%rdi\n\t" //return value
            "syscall\n\t"
            "ret"
            : "=g"(ret)
            : "g"(ex)
    );
}
void _start(){
    register int ac __asm__("%rsi"); // in absence of main() argc seems to be placed in rsi register
    //int acp = ac;
    unsigned long long strLength;
    if(ac > 1){
        register unsigned long long * arg __asm__("%rsp"); //argv array
        char * a = (void*)*(arg + 7); //pointer to argv[1]
        char * b = a; //workaround for print function
        /*version with print_1 and while() loop for counting
        unsigned long long strLength = 0;
        while(*(a + strLength)) strLength++;
        print_1(a, strLength);
        print_1("\n", 1);
        */
        strLength = print(b);
        print("\n");
    }
    //my_exit(acp); //echo $? prints argc
    my_exit(strLength); //echo $? prints string length
}
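For reference, the System V x86-64 ABI places argc in the word at the stack pointer when the process starts, followed by the argv pointers, a NULL, the environment pointers, and another NULL. Inside a compiled C function such as _start the prologue may already have moved RSP, which is presumably why a fixed offset like arg + 7 is needed and why it depends on the compiler's stack frame. The sketch below models that layout in plain C so the relationship between argv and the environment strings is visible; start_info and parse_initial_stack are hypothetical names, and sp is assumed to point at the argc slot.

```c
#include <stdint.h>

/* Hypothetical view of the initial process stack. */
struct start_info {
    long   argc;
    char **argv;   /* argc pointers followed by NULL      */
    char **envp;   /* environment pointers, NULL at end   */
};

/* sp is assumed to point at the argc word, i.e. the value RSP has
   at process entry before any function prologue runs. */
struct start_info parse_initial_stack(char **sp) {
    struct start_info info;
    info.argc = (long)(intptr_t)sp[0];      /* first word holds argc   */
    info.argv = sp + 1;                     /* argv[0] follows argc    */
    info.envp = info.argv + info.argc + 1;  /* skip argv and its NULL  */
    return info;
}
```

Since envp[0] sits right after the NULL that ends argv, a read that lands two slots past the last argument returns the first environment string.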