Recently I have been learning NASM x86 assembly. Whenever I use int 21h, int 80h, or any other interrupt in my code, the program crashes with a segmentation fault. The entire program is:

section .data
    msg: dw 'Hello World!', 10
section .text
    global _main

_main:
    mov eax, 4
    mov ebx, 1
    mov ecx, msg
    int 80h

Any help or tips on where I can learn more about NASM are highly, highly appreciated.

Alex Dukhan
  • 4
    What operating system are you programming for and on? What architecture are you programming for? Note that system calls only work on the operating system they have been designed for, so using `int 21h` won't work on Linux and using `int 80h` won't work on DOS. – fuz Apr 11 '18 at 20:48
  • 5
    Also, there is no code in your program after the write system call, so it's going to crash even if the system call succeeds. – fuz Apr 11 '18 at 20:49
  • @fuz I have tried both on both. I have a Linux virtual machine running on a Windows 10 host OS, and I have run both int 21h and int 80h on both operating systems; each time the program threw a segmentation fault or simply crashed. Also, you are right about the crashing bit: I just re-ran the code with a `syscall` at the end and it still finished with a segmentation fault. – Alex Dukhan Apr 11 '18 at 20:56
  • 1
    If you are using the "Linux in Windows" feature of Windows 10 (WSL), its kernel is 64-bit only, so `int 0x80` will not work. You will have to build an elf64 binary and use the proper 64-bit system call ABI (with the `syscall` instruction). To get a kernel with 32-bit compatibility, you will have to install a full Linux in a VM. – Ped7g Apr 11 '18 at 21:01
  • 1
    @AlexDukhan You can't just add a `syscall` instruction to the end and expect that to work. Linux system calls using the `syscall` instruction are for amd64 and work differently. I advise you to stick to a single tutorial and run your code on actual Linux, not WSL. You seem to have fallen into the trap of copying code from multiple tutorials without understanding what it actually does and then getting frustrated when it doesn't work. – fuz Apr 11 '18 at 21:24
  • @fuz I get that and I tried that. I even dual-booted into Linux on my PC and followed one tutorial. The problem is that every tutorial I have run into hasn't worked. If you have any recommendations, I would love to see them. – Alex Dukhan Apr 11 '18 at 21:28
  • 2
    @AlexDukhan Please add the exact program you tried and the exact commands you typed to assemble and run the program to your question—the program you just posted won't work regardless of how you assemble it because it doesn't actually contain code to terminate, so it's just going to crash after the write system call. Also, `_main` is the wrong entry point when writing Linux programs. There are so many ways you can do things wrong, so it's important for you to be specific or it's really hard to help you. – fuz Apr 11 '18 at 21:50
  • @fuz OK, so I get that this is terrible code; that is more than evident. But do you have a tutorial or something that I could follow to learn from my mistakes? Either MASM or NASM will work. – Alex Dukhan Apr 11 '18 at 21:54
  • @AlexDukhan I can recommend *Assembly Language Step by Step* by *Jeff Duntemann.* – fuz Apr 11 '18 at 22:42
  • @fuz thank you, I'll follow that as best I can. Also, if you could close this post, that would be very helpful. Thank you for all the help. – Alex Dukhan Apr 11 '18 at 22:46
  • There are some links in [the x86 tag wiki](https://stackoverflow.com/tags/x86/info), including [Programming from the Ground Up](https://savannah.nongnu.org/projects/pgubook/) (a free book) which is pretty good from the parts I've skimmed. It's for i386 (32-bit) Linux, and gets into details of how system calls work as well as just asm. – Peter Cordes Apr 12 '18 at 12:08
  • The 2nd duplicate link I added to the question has a complete and fairly detailed explanation of a working 32-bit Linux Hello World implementation. – Peter Cordes Apr 12 '18 at 12:15

1 Answer

4

whenever I use int 21h or int 80h

The int instruction is a special variant of the call instruction: it calls a function provided by the operating system.

This means of course that the int instruction behaves differently in different operating systems:

int 21h

Interrupt 21h was used in MS-DOS and 16-bit Windows (Windows 3.x). Therefore this instruction could be used in MS-DOS and 16-bit Windows programs only.

The interrupt is not supported in 32-bit (or 64-bit) Windows programs. Linux does not support this interrupt either.

int 80h

This interrupt is supported in 32-bit Linux programs. 64-bit Linux versions can run 32-bit Linux programs (but you'll have to ensure that the program you are creating really is a 32-bit program and not a 64-bit program).

other interrupts (such as int 10h)

... are neither supported by Linux nor by recent Windows versions. (They were supported in 16-bit Windows.)

int 80h ... it returns a segmentation fault.

Under Linux you may run the strace command to see what is happening with the int 80h system call.

I did this with your program and got the following output:

$ strace ./x.x
execve("./x.x", ["./x.x"], [/* 54 vars */]) = 0
strace: [ Process PID=3789 runs in 32 bit mode. ]
write(1, "", 0)                         = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---

You can see that int 80h does not cause the fault; the system call is executed correctly.

However, the edx register, which holds the number of bytes to write, has the value 0. Therefore int 80h will output the first 0 bytes (= nothing) of your "Hello World".

You'll have to add the instruction mov edx, 13 before the int 80h instruction (12 characters plus the newline).

The segmentation fault happens later!

As a beginner in assembly language, you should first understand what an assembler does: each assembler instruction represents some bytes in memory.

For example, the instruction mov eax, 4 represents the bytes 184, 4, 0, 0, 0, and the instruction int 80h represents the bytes 205, 128.
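To make the instruction-to-bytes mapping concrete, here is a minimal sketch of a 32-bit Linux program written entirely as raw bytes with db. It does nothing but exit with status 0 (system call number 1 is exit on 32-bit Linux). The file name bytes.asm and the build commands in the comment are assumptions for illustration, not something taken from the question.

```nasm
; bytes.asm (hypothetical file name)
; Assumed build: nasm -f elf32 bytes.asm && ld -m elf_i386 -o bytes bytes.o

section .text
    global _start

_start:
    db 184, 1, 0, 0, 0      ; same bytes as: mov eax, 1   (exit system call)
    db 187, 0, 0, 0, 0      ; same bytes as: mov ebx, 0   (exit status 0)
    db 205, 128             ; same bytes as: int 80h
```

Replacing each db line with the mnemonic from its comment should give an identical binary; the CPU only ever sees the bytes.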

Your assembler program ends after the instruction int 80h. However, memory does of course not end after the bytes 205, 128. The memory following those bytes will contain whatever data happens to be there.

Maybe the bytes in memory after those bytes are 160, 0, 0, 0, 0, which is the encoding of mov al, [0]. Executing that would cause a segmentation fault.

You'll have to add some instructions after the int 80h instruction that will stop your program. Otherwise the CPU will interpret the bytes in memory following the int 80h instruction as instructions and execute them...
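Putting both fixes together, a corrected version of the program might look like the sketch below. It assumes a 32-bit Linux target and uses _start as the entry point (as noted in the comments on the question, _main is not the right entry point on Linux). The file name hello.asm, the len symbol, and the build commands are illustrative assumptions. It also defines the string with db instead of dw and lets the assembler compute the length rather than hard-coding 13.

```nasm
; hello.asm (hypothetical file name)
; Assumed build: nasm -f elf32 hello.asm && ld -m elf_i386 -o hello hello.o

section .data
    msg: db 'Hello World!', 10  ; db stores single bytes; 10 is the newline
    len equ $ - msg             ; length of msg in bytes (13), computed by NASM

section .text
    global _start               ; default entry point expected by ld

_start:
    mov eax, 4                  ; system call 4 = write
    mov ebx, 1                  ; file descriptor 1 = stdout
    mov ecx, msg                ; address of the bytes to write
    mov edx, len                ; number of bytes to write
    int 80h

    mov eax, 1                  ; system call 1 = exit
    mov ebx, 0                  ; exit status 0
    int 80h                     ; does not return, so nothing after it is executed
```

Running this under strace should show a write call with a length of 13 followed by the exit call, with no SIGSEGV afterwards.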

Martin Rosenau
  • I closed this as a duplicate of another fall-off-the-end-of-`_start` question, but IDK if I would have if you hadn't already posted this good answer. Maybe that means I shouldn't have, but I think it's just a bogus question where there are 2 totally different things going on: that bug + the DOS question. – Peter Cordes Apr 12 '18 at 12:11
  • Thank you for all of the help. I'll try to use it to the best of my ability. Thanks again! – Alex Dukhan Apr 12 '18 at 16:29