section .text
    global _start       ;must be declared for using gcc
_start:                     ;tell linker entry point
    mov edx, len    ;message length
    mov ecx, msg    ;message to write
    mov ebx, 1      ;file descriptor (stdout)
    mov eax, 4      ;system call number (sys_write)
    int 0x80        ;call kernel
    mov eax, 1      ;system call number (sys_exit)
    int 0x80        ;call kernel

section .data

msg db  'Hello, world!',0xa ;our dear string
len equ $ - msg         ;length of our dear string

This is a basic 32-bit x86 Linux assembly program that prints "Hello, World!" to the screen (standard output). Build and run it with:

nasm -felf -g -Fdwarf hello.asm
gcc -g -m32 -nostdlib -static -o hello hello.o
./hello

(Editor's note: or gdb ./hello to debug / single-step it. That's why we used nasm -g -Fdwarf and gcc -g. Or use layout reg inside GDB for disassembly+register view that doesn't depend on debug symbols. See the bottom of https://stackoverflow.com/tags/x86/info)


Now I want to ask how this code works behind the scenes. For example, why do we need all these instructions

_start:                     ;tell linker entry point
        mov edx, len    ;message length
        mov ecx, msg    ;message to write
        mov ebx, 1      ;file descriptor (stdout)
        mov eax, 4      ;system call number (sys_write)
        int 0x80        ;call kernel
        mov eax, 1      ;system call number (sys_exit)
        int 0x80        ;call kernel

just to print "Hello, World!" and the statement

_start:

above! Is it the main function?

and the statement

int 0x80

why is it used at all? Can you give me an in-depth explanation of how this program works?

Peter Cordes
nikumbh tyagi
  • It seems like you are lacking the fundamentals of Linux or of assembly programming in general. A book or an online tutorial may be more appropriate :-) The only assembly-specific aspect here is the privilege transfer (the system call) done with the now-obsolete 32-bit `int 0x80`. The rest of the program just sets up the parameters that, as a programmer, you would expect in any programming language. – Margaret Bloom Jul 12 '17 at 09:09
  • "In the beginning God created the heavens and the earth" ... where should we start? @MargaretBloom is fully correct, try to find a good tutorial – Tommylee2k Jul 12 '17 at 09:29
  • Could you please suggest some? – nikumbh tyagi Jul 12 '17 at 11:52
  • *"Like what is the need for all these instructions just to print "Hello, World!""* ... actually this part does almost nothing; the meat of the hello-world printing is hidden in the OS in this case. If this were an example for some old 8-bit computer, putting "Hello, World" on the screen by writing values directly into video RAM and setting up the video chip to display them, then the code would involve many more instructions and a lot more complex logic, plus you would also need the "font" pixel data for each letter, etc. This is like "Hey OS, print Hello world for me." in terms of work "effort". – Ped7g Jul 12 '17 at 16:31
  • Related: [Hello, world in assembly language with Linux system calls?](https://stackoverflow.com/q/61519222) for a fully commented "Hello World" using 32-bit int 0x80 system calls. Also [another answer](https://stackoverflow.com/a/39551489) using `puts` from libc instead of system calls directly. – Peter Cordes Jan 26 '22 at 03:16

1 Answer


In machine code, there are no functions. At least, the processor knows nothing about functions; the programmer can structure the code however they like. _start is a symbol, which is just a name for a location in your program. Symbols are used to refer to locations whose addresses you don't know yet; they are resolved during linking. The symbol _start is used as the entry point (cf. this answer), which is where the operating system jumps to start your program. Unless you specify the entry point in some other way, every program must contain _start. The other symbols your program uses are msg, which the linker resolves to the address where the string Hello, world! resides, and len, which is the length of msg.
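
As a rough illustration of the remark about specifying the entry point some other way, here is a minimal sketch in which the entry symbol is not called _start. The symbol name my_entry, the file name entry.asm, and the choice of linking directly with ld and its -e option are assumptions made for this example; none of them come from the question.

```nasm
; Build sketch (assumed file name entry.asm):
;   nasm -felf32 entry.asm
;   ld -m elf_i386 -e my_entry -o entry entry.o
section .text
    global my_entry     ; export the symbol so the linker can see it
my_entry:               ; hypothetical name; "ld -e my_entry" makes it the entry point
    mov ebx, 0          ; exit status 0
    mov eax, 1          ; system call number (sys_exit)
    int 0x80            ; call kernel; does not return
```

The label is still only a name for an address. The linker records that address in the executable as the place where execution begins.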

The rest of the program does the following things:

  1. Set up the registers for the system call write(1, msg, len): the file descriptor 1 goes into ebx, the address msg into ecx, and the length len into edx. write has system call number 4, which is stored in eax to let the operating system know you want system call 4. This system call writes data to a file. File descriptor 1 stands for standard output.
  2. Perform a system call using int 0x80. This instruction interrupts your program; the operating system picks this up and performs the function whose number is stored in eax. It's like a function call that calls into the OS kernel. The calling convention is different from that of ordinary functions, with args passed in registers.
  3. Set up the registers for the system call _exit(?). Its system call number is 1, which goes into eax. Sadly, the code forgets to set the argument for _exit, which should be 0 to indicate success. Instead, whatever was in ebx before is used, which is the 1 left over from the file descriptor passed to write, so the program exits with status 1.
  4. Perform a system call using int 0x80. Because _exit ends the program, it does not return. Your program ends here.
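
The four steps above can be sketched as a complete program that also sets the exit status explicitly. This is a minimal variant of the question's code, not the asker's original. The only change is the added xor ebx, ebx line, and the per-argument comments are my own annotation.

```nasm
section .text
    global _start
_start:
    mov edx, len        ; 3rd argument: number of bytes to write
    mov ecx, msg        ; 2nd argument: address of the buffer
    mov ebx, 1          ; 1st argument: file descriptor 1 (stdout)
    mov eax, 4          ; system call number 4 (sys_write)
    int 0x80            ; enter the kernel: write(1, msg, len)

    xor ebx, ebx        ; 1st argument: exit status 0 (the question's code omits this)
    mov eax, 1          ; system call number 1 (sys_exit)
    int 0x80            ; enter the kernel: _exit(0), never returns

section .data
msg db 'Hello, world!', 0xa
len equ $ - msg
```

Built with the same commands as in the question, running ./hello and then echo $? in the shell should report 0 instead of the 1 left in ebx by the original version.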

The directive db tells the assembler to place the following data into the program where we currently are. This places the string Hello, world! followed by a newline into the program so we can tell the write system call to write that string.

The line len equ $ - msg tells the assembler that len is the difference between $ (where we currently are) and msg. This is defined so we can pass to write how long the text we want to print is.
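
To make the $ - msg arithmetic concrete, here is a small sketch of a data section with two strings. The second string and the names bye and blen are hypothetical and added only for illustration. The point is that the equ line has to come directly after the data it measures, because $ always means the current position.

```nasm
section .data
msg   db 'Hello, world!', 0xa   ; 13 characters plus a newline = 14 bytes
len   equ $ - msg               ; $ is the address just past the newline, so len = 14

bye   db 'Bye', 0xa             ; hypothetical second string: 4 bytes
blen  equ $ - bye               ; 4; writing "$ - msg" here would give 18 instead
```

Because equ defines an assemble-time constant, len and blen occupy no space in the program; they are substituted as plain numbers wherever they are used.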

Everything after a semicolon (;) in the program is a comment ignored by the assembler.

fuz
  • @PeterCordes Please don't just dump random question links into the answer. That looks super ugly. – fuz Apr 29 '19 at 21:12
  • I kind of like having the question title there in the question text, especially when it's a useful title that describes what the Q&A is about. But also just so I can recognize it as one I've seen before or not. A "see also ..." before it would have been less ugly, though, sorry. – Peter Cordes Apr 29 '19 at 21:15