I am trying to learn assembly language on Ubuntu 16.04 x64. For now I have the following problem: read an integer n and print the numbers from 1 to n.

For n = 5 I should have 1 2 3 4 5. I tried to do it with scanf and printf but after I input the number, it exits.

The code is:

;nasm -felf64 code.asm && gcc code.o && ./a.out

SECTION .data
    message1: db "Enter the number: ",0
    message1Len: equ $-message1
    message2: db "The numbers are:", 0
    formatin: db "%d",0
    formatout: db "%d",10,0 ; newline, nul
    integer: times 4 db 0 ; 32-bits integer = 4 bytes

SECTION .text
    global main
    extern scanf
    extern printf

main:

    mov eax, 4
    mov ebx, 1
    mov ecx, message1
    mov edx, message1Len
    int 80h

    mov rdi, formatin
    mov rsi, integer
    mov al, 0
    call scanf
    int 80h

    mov rax, integer
    loop:
        push rax
        push formatout
        call printf
        add esp, 8
        dec rax
    jnz loop

    mov rax,0

ret

I am aware that in this loop I would have the inverse output (5 4 3 2 1 0), but I did not know how to set the condition.

The command I'm using is the following:

nasm -felf64 code.asm && gcc code.o && ./a.out

Can you please help me find where I'm going wrong?

lidia901
  • Why do you `push` arguments for `printf` to the stack? What source of information did you use to do it like this? (I have a suspicion you are changing 32b code/tutorial into 64b, but that will not work so simply, it's more complicated... for the moment, if you have good 32b asm resource for learning, it would be much easier to teach you how to build 32b binary under 64b linux and work with that instead). ... either that, or your 64b resource is of low quality, and try some better one... – Ped7g Nov 19 '17 at 21:33
  • @Ped7g Probably I confused 32b with 64b... I modified everything but I still have problems with the printf function in a loop... and I can't find documentation for it. It's harder than I expected. :D – lidia901 Nov 19 '17 at 22:08
  • Can you edit your question, and show 32b variant of source + command line how you build it? `printf` is one of the more complex ones to get right, as it has variable amount of arguments, so you need to know the proper calling convention beyond basics. You may also check this tutorial (looks to target exactly nasm+libc+32b and looks to be well commented): https://www.csee.umbc.edu/portal/help/nasm/sample.shtml And if you are just starting with assembly, I would skip calling libc functions completely, and toy around with pure x86 instructions (doing some math), checking values in debugger only. – Ped7g Nov 19 '17 at 22:21
  • And there's also a 64b sample link, but again, if you are just starting with assembly, I would suggest to stick to 32b (as long as calling libc is involved, for pure x86-64 asm without external calls - the 64b is only tiny bit more complicated, it's the calling convention itself which is lot more tricky than 32b, you have to keep also stack aligned ahead of each call, and there's the "red zone" feature, etc..). But being able to use debugger to single step over instructions and verify the state of registers/flags/memory is essential, lot more important, than calling printf. – Ped7g Nov 19 '17 at 22:25
  • About *"harder than I expected"* - well, in assembly you have full control over machine, so you can tell it to do anything, what it is capable of. Which means, that for every legal+wanted action you have about dozen of valid ways how to write it, and thousands of invalid ways, which did look as good idea when you wrote them. You need to learn to be absolutely precise in every step, from formulating what you want to achieve, how you want to achieve it, and why each instruction in code belongs there, then you need to learn to re-read that and compare it with reality check in debugger, then fix. – Ped7g Nov 19 '17 at 22:30
  • @Ped7g does it matter that my OS type is for 64bits? I thought so and that's why I started with the 64b programs. – lidia901 Nov 19 '17 at 22:32
  • Well.. about editing the question and showing 32b variant of source.. I can't anymore cause I modified like everything and I don't find it anymore. But I can show the actual state of my program if it helps – lidia901 Nov 19 '17 at 22:35
  • Yes, it may, but 64b linux is usually capable to run 32b binaries with 32b compatibility layer (the 64b ubuntu "WSL" packed in windows 10 is NOT capable to run 32b binaries), so ordinary 64b linux installation is very likely ready to produce+run+debug 32b binaries (or you will have to install just few more packages to have 32b support in gcc/etc). The ordinary "Linux Ubuntu 16.04 x64" definitely can be set up to work with 32b (I'm myself on KDE neon distribution based on 16.04, verifying nasm Q+A for both 32b and 64b asm easily, using edb-debugger built from source from github). – Ped7g Nov 19 '17 at 22:35
  • https://stackoverflow.com/a/36901649/4271923 (hmm, that's actually too much gcc+as oriented, but searching along these lines "nasm 32b linux 64b" should give you something more nasm focused in few links) – Ped7g Nov 19 '17 at 22:38
  • Thank you so much, @Ped7g! You are very kind! :) I will try to learn first on 32bits. – lidia901 Nov 19 '17 at 22:40
  • about learning 32b first -> don't worry, in terms of the pure x86 instructions, the step from 32b to 64b is not huge (some more registers, some registers not available, some special rules about 32b reg usage, that's almost all). It's just the calling convention on 64b system is much better (in terms of performance) and complex, it's a bit more difficult to follow for humans (that was not important when designing it, as 99% of code is produced by compilers, while performance was important). – Ped7g Nov 19 '17 at 22:45
  • Now I recall I did add to some recent answer a fully working example for nasm 32b mixing with clib `printf`: https://stackoverflow.com/questions/47362660/solving-pascals-triangle-nck-in-assembly-recursion/47366155#47366155 ... feel free to ask there if anything is not clear or working for you. (about the lengthy command lines ... looks probably tedious, but I'm using Kate text editor with build-methods setup, so I don't mind those lengthy names... then again you can also store those commands in shell script or even make file). Sadly I didn't bother to add 64b variant. – Ped7g Nov 19 '17 at 22:50
  • And I recall it in wrong way, it was other way around, the C++ code calling the assembler, and I even tried the "fastcall" convention intentionally... so it's not about calling `printf` from assembly.. sorry :D.. still, there are many tutorials on the Internet, and I'm too tired to write full answer here. – Ped7g Nov 19 '17 at 22:52
  • @Ped7g: My answer on https://stackoverflow.com/a/36901649/4271923 that you linked earlier does have a NASM section. But the OP is using 64-bit registers and calling convention, so maybe the real mistake is using `int 0x80` in 64-bit mode (https://stackoverflow.com/questions/46087730/what-happens-if-you-use-the-32-bit-int-0x80-linux-abi-in-64-bit-code). Well really the problem is not deciding whether to use system calls or stdio library functions. And worse, using `call scanf` / `int 80h` so the syscall number is determined by the scanf return value!!!! – Peter Cordes Nov 20 '17 at 00:14

2 Answers

There are several problems:
1. The parameters to printf, as discussed in the comments. In x86-64, the first few parameters are passed in registers.
2. printf does not preserve the value of eax: it returns its result there, so a loop counter kept in rax is overwritten by every call.
3. The stack is misaligned.
4. rbx is used without saving the caller's value.
5. The address of integer is being loaded instead of its value.
6. Since printf is a varargs function, al must be set before the call to the number of vector (XMM) registers used for arguments, which is 0 here.
7. Spurious int 80h after the call to scanf.

I'll repeat the entire function in order to show the necessary changes in context.

main:
    push rbx           ; This fixes problems 3 and 4.

    mov eax, 4
    mov ebx, 1
    mov ecx, message1
    mov edx, message1Len
    int 80h

    mov rdi, formatin
    mov rsi, integer
    mov al, 0
    call scanf

    mov ebx, [integer] ; fix problems 2 and 5
    loop:
        mov rdi, formatout   ; fix problem 1
        mov esi, ebx
        xor eax, eax   ; fix problem 6
        call printf
        dec ebx
    jnz loop

    pop rbx            ; restore caller's value
    mov rax,0

ret

P.S. To make it count up instead of down, change the loop like this:

    mov ebx, 1
    loop:
        <call printf>
        inc ebx
        cmp ebx, [integer]
    jle loop
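
Putting the fixes above together, a complete count-up version might look roughly like the sketch below. It is an illustration under a few assumptions, not the asker's original code: the prompt is printed with printf followed by fflush(NULL) instead of int 80h, `default rel` and `wrt ..plt` are used so that it should link with either PIE or non-PIE gcc defaults, and the labels .next and .done and the file name count.asm are invented for this example.

```nasm
; Build (assumed file name count.asm):
;   nasm -felf64 count.asm && gcc count.o -o count && ./count

default rel                         ; use RIP-relative addressing for [label]

SECTION .data
    message1:  db "Enter the number: ", 0
    formatin:  db "%d", 0
    formatout: db "%d", 10, 0       ; "%d\n"
    integer:   dd 0                 ; 32-bit integer filled in by scanf

SECTION .text
    global main
    extern scanf
    extern printf
    extern fflush

main:
    push rbx                        ; save caller's rbx; also makes rsp 16-byte aligned

    lea rdi, [message1]             ; printf(message1)
    xor eax, eax                    ; al = 0: no vector-register arguments
    call printf wrt ..plt

    xor edi, edi                    ; fflush(NULL): make sure the prompt is visible
    call fflush wrt ..plt

    lea rdi, [formatin]             ; scanf("%d", &integer)
    lea rsi, [integer]
    xor eax, eax
    call scanf wrt ..plt
    cmp eax, 1                      ; scanf returns the number of conversions
    jne .done                       ; no valid number was read: print nothing

    mov ebx, 1                      ; counter lives in a call-preserved register
    cmp ebx, [integer]
    jg .done                        ; n < 1: nothing to print
.next:
    lea rdi, [formatout]            ; printf("%d\n", counter)
    mov esi, ebx
    xor eax, eax
    call printf wrt ..plt
    inc ebx
    cmp ebx, [integer]
    jle .next                       ; loop while counter <= n

.done:
    pop rbx                         ; restore caller's rbx
    xor eax, eax                    ; return 0 from main
    ret

SECTION .note.GNU-stack noalloc noexec nowrite progbits   ; avoid an executable-stack warning
```

With an input of 5 this should print 1 through 5, one number per line. The extra `cmp`/`jg` before the loop is a design choice of this sketch so that n = 0 or a negative n prints nothing.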
prl
  • The 32-bit `int 80h` ABI for sys_write in 64-bit code isn't technically wrong, but the 64-bit `syscall` would be much better. (Also, you didn't mention problem 0 which was actually making the program exit: `int 80h` with `eax` = scanf's return value = 1 = __NR_exit. (see my answer). – Peter Cordes Nov 20 '17 at 00:38
  • @PeterCordes another problem is mixing clib IO functions together with sys_write ... I mean, I was too tired to produce a complete fix, as there went so much wrong, so I rather tried just to propose smaller steps for the beginning. (actually I'm afraid there's so much in the fix changed, that it will be hard to comprehend at all, without taking those smaller steps first) – Ped7g Nov 20 '17 at 01:15
  • @Ped7g: it's actually safe if you `sys_write` *before* using any stdio library functions that might buffer the I/O instead of doing it before returning. But yes, it's definitely something to warn against. – Peter Cordes Nov 20 '17 at 01:41
  • @prl I think it should've been jge loop at the end, right? Also, thank you so much! – lidia901 Nov 20 '17 at 08:31
  • Yes, I wrote the compare backwards. Hazard of being forced to read AT&T syntax so often. I'll fix it by reversing the compare, rather than the conditional branch. – prl Nov 20 '17 at 16:11

You are calling scanf correctly, using the x86-64 System V calling convention. It leaves its return value in eax. After successful conversion of one operand (%d), it returns with eax = 1.

    ; ... correct setup for scanf, including zeroing AL.

    call scanf    ; correct
    int 80h       ; insane: system call with eax = scanf return value

Then you run int 80h, which makes a 32-bit legacy-ABI system call and uses eax=1 as the system-call number (see What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?).

eax=1 / int 80h is sys_exit on Linux. (unistd_32.h has __NR_exit = 1). Use a debugger; that would have shown you which instruction was making your program exit.
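
To see that effect in isolation, here is a minimal stand-alone sketch that does nothing except eax=1 / int 80h. The file name exit80.asm and the exit status 42 are arbitrary choices for illustration, and it assumes a kernel built with support for the 32-bit int 80h ABI (32-bit emulation), as ordinary x86-64 Linux distributions are.

```nasm
; Build (assumed file name exit80.asm; no libc, so link with ld):
;   nasm -felf64 exit80.asm && ld exit80.o -o exit80 && ./exit80; echo $?

SECTION .text
    global _start

_start:
    mov eax, 1          ; 1 = __NR_exit in the 32-bit int 80h ABI
    mov ebx, 42         ; exit status (arbitrary value for this demo)
    int 80h             ; the process exits here

SECTION .note.GNU-stack noalloc noexec nowrite progbits
```

Running it and then printing `$?` should show 42. In the question's program the same thing happens by accident, because scanf leaves 1 in eax right before the stray int 80h.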

Your title (before I corrected it) said you got a segmentation fault, but I tested on my x86-64 desktop and that's not the case. It exits cleanly using an int 80h exit system call. (But in code that does segfault, use a debugger to find out which instruction.) strace decodes int 0x80 system calls incorrectly in 64-bit processes, using the 64-bit syscall call numbers from unistd_64.h, not the 32-bit unistd_32.h call numbers.


Your code was close to working: you use the int 0x80 32-bit ABI correctly for sys_write, and only pass it 32-bit args. (The pointer args fit in 32 bits because static code/data in a non-PIE executable is placed in the low 2GiB of virtual address space in the default code model on x86-64. That placement exists exactly so you can use compact instructions like mov edi, formatin to put addresses in registers, or use them as immediates or rel32 signed displacements.)

OTOH I think you were doing that for the wrong reason. And as @prl points out, you forgot to maintain 16-byte stack alignment.

Also, mixing system calls with C stdio functions is usually a bad idea. Stdio uses internal buffers instead of always making a system call on every function call, so things can appear out of order, or a read can be waiting for user input when there's already data in the stdio buffer for stdin.


Your loop is broken in several ways, too. You seem to be trying to call printf with the 32-bit calling convention (args on the stack).

Even in 32-bit code, this is broken, because printf's return value is in eax. So your loop is infinite, because printf returns the number of characters printed. That's at least two from the %d\n format string, so dec rax / jnz will always jump.

In the x86-64 SysV ABI, you need to zero al before calling printf (with xor eax,eax), if you didn't pass any FP args in XMM registers. You also have to pass args in rdi, rsi, ..., like for scanf.

You also add rsp, 8 after pushing two 8-byte values, so the stack grows forever. (But you never return, so the eventual segfault will be on stack overflow, not on trying to return with rsp not pointing to the return address.)


Decide whether you're making 32-bit or 64-bit code, and only copy/paste from examples for the mode and OS you're targeting. (Note that 64-bit code can and often does use mostly 32-bit registers, though.)

See also Assembling 32-bit binaries on a 64-bit system (GNU toolchain) (which does include a NASM section with a handy asm-link script that assembles and links into a static binary). But since you're writing main instead of _start and are using libc functions, you should just link with gcc -m32 (if you decide to use 32-bit code instead of replacing the 32-bit parts of your program with 64-bit function-calling and system-call conventions).

See What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.
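
For comparison, here is a rough sketch of the same prompt written with the native 64-bit syscall ABI instead of int 80h. The call numbers (1 = write, 60 = exit) come from unistd_64.h; the file name write64.asm and the choice to use _start without libc are assumptions made only for this illustration.

```nasm
; Build (assumed file name write64.asm; no libc, so link with ld):
;   nasm -felf64 write64.asm && ld write64.o -o write64 && ./write64

default rel                         ; use RIP-relative addressing for [label]

SECTION .data
    message1:    db "Enter the number: "
    message1Len: equ $-message1

SECTION .text
    global _start

_start:
    mov eax, 1                      ; __NR_write in the 64-bit ABI
    mov edi, 1                      ; first arg: fd 1 = stdout
    lea rsi, [message1]             ; second arg: buffer address
    mov edx, message1Len            ; third arg: byte count
    syscall                         ; clobbers rcx and r11; result in rax

    mov eax, 60                     ; __NR_exit in the 64-bit ABI
    xor edi, edi                    ; exit status 0
    syscall

SECTION .note.GNU-stack noalloc noexec nowrite progbits
```

Note that the arguments go in rdi, rsi, rdx rather than ebx, ecx, edx, and that call number 1 means write here but exit in the 32-bit ABI, which is why mixing up the two tables is so confusing.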

Peter Cordes