I'm trying to learn system calls in assembly from this page on tutorialspoint.

That page has an assembly program that reads user input and prints it back. It looks confusing to me and, unfortunately, doesn't work:

section .data                           ;Data segment
   userMsg db 'Please enter a number: ' ;Ask the user to enter a number
   lenUserMsg equ $-userMsg             ;The length of the message
   dispMsg db 'You have entered: '
   lenDispMsg equ $-dispMsg                 

section .bss           ;Uninitialized data
   num resb 5

section .text          ;Code Segment
   global _start

_start:                ;User prompt
   mov eax, 4
   mov ebx, 1
   mov ecx, userMsg
   mov edx, lenUserMsg
   int 80h

   ;Read and store the user input
   mov eax, 3
   mov ebx, 2
   mov ecx, num  
   mov edx, 5          ;5 bytes (numeric, 1 for sign) of that information
   int 80h

   ;Output the message 'The entered number is: '
   mov eax, 4
   mov ebx, 1
   mov ecx, dispMsg
   mov edx, lenDispMsg
   int 80h  

   ;Output the number entered
   mov eax, 4
   mov ebx, 1
   mov ecx, num
   mov edx, 5
   int 80h  

   ; Exit code
   mov eax, 1
   mov ebx, 0
   int 80h

When the program runs, it never asks for input; it exits immediately.


I find certain parts of the code confusing, and suspect they might have something to do with the failure:

   ;Read and store the user input
   mov eax, 3
   mov ebx, 2
   mov ecx, num  
   mov edx, 5          ;5 bytes (numeric, 1 for sign) of that information
   int 80h

In the code above, it makes sense for eax (the 32-bit accumulator register) to be 3, since that selects the sys_read system call. edx probably defines the data type, and considering that we are saving an integer, 5 makes sense.

But ebx (the 32-bit base register) should contain the file descriptor (where stdin=0, stdout=1, stderr=2). So why is ebx=2 in the code above?


Apologies if the question is too simple, but why doesn't the code work? Is it caused by incorrect values in the registers, such as the ones I mentioned above?

ShellRox
  • 1
    Unless you've redirected `stderr`, reading from it probably works the same as if reading from `stdin`. – Michael Sep 11 '18 at 19:39
  • 3
    stderr can be used as input stream: https://stackoverflow.com/a/51308591/3512216 – rkhb Sep 11 '18 at 19:42
  • 2
    The program works, but you have to assemble and [link it as a 32-bit executable](https://stackoverflow.com/questions/16004206/force-gnu-linker-to-generate-32-bit-elf-executables). You can read the first part of the tutorial, I guess, but then move on to the calling convention of 64-bit syscalls, get the list of syscalls and use them like you would in C. Section 2 of the manual documents the wrappers around the syscalls; googling for the name of a syscall will also turn up some documentation. In the worst case, there is the kernel source. – Margaret Bloom Sep 11 '18 at 19:42
  • 1
    @MargaretBloom: My Debian Jessie accepts and runs this program - as is - in both 32-bit and 64-bit mode. – rkhb Sep 11 '18 at 19:51
  • 3
    @rkhb That's not strange; it also does on my CentOS, but it depends on the default switches your distro configured GCC with. See [this great answer](https://stackoverflow.com/a/46087731/5801661). I guessed that may be the OP's problem. – Margaret Bloom Sep 11 '18 at 19:54
  • @rkhb as always in assembly, the fact that "the program did run as expected" doesn't say much about its correctness (it is certainly closer to correct code than a program which doesn't produce correct observable behaviour, but it may still be a long way from correct). – Ped7g Sep 11 '18 at 19:57
  • @Margaret Bloom Is there something wrong with the kernel interrupt? I've heard that 0x80 is the kernel interrupt on every Linux distribution. I've run this code on multiple online NASM assemblers, and all of them gave the same result. The page linked above also has a link for assembling their code online (which should have the right settings for that code), but it still doesn't work. – ShellRox Sep 11 '18 at 20:09
  • @Michael But what's the point of using the stderr descriptor for input when there is stdin? Is there some advantage? – ShellRox Sep 11 '18 at 20:11
  • 2
    @ShellRox `int 0x80` is the old 32-bit syscall interface. 64-bit uses `syscall`. Take a look at the linked answer I gave to rkhb ;) – Margaret Bloom Sep 11 '18 at 20:34
  • @MargaretBloom Apologies for the confusion. I tried replacing `int 0x80` with `syscall` as an experiment, but it didn't work: there was an illegal instruction error. From the last link, I've understood that `int 0x80` will work unless the pointers are wider than 32 bits. – ShellRox Sep 11 '18 at 20:59
  • 1
    @ShellRox `syscall` is a different thing; it's better to find a tutorial about it. Exactly! `int 0x80` will work if all the pointers fit in 32 bits :) But you have to make sure this is the case. – Margaret Bloom Sep 11 '18 at 21:42
  • 1
    @MargaretBloom: why is anyone talking about porting to x86-64 here? This is a valid program for i386 Linux, correctly using the legacy `int 0x80` 32-bit ABI. It would also happen to work if built as a 64-bit static executable, because the default code model puts static symbols in the low 32 bits of address space. It would only fail if built as a 64-bit PIE executable (which would explain the observed symptoms), or on kernels without CONFIG_IA32_EMULATION, in which case it would segfault or something instead of exiting cleanly. – Peter Cordes Sep 11 '18 at 23:18
  • 1
    Of course it's a pretty crappy program. It reads from `stderr` for no reason, and it dumps the whole 5-byte buffer instead of saving the return value from `read(2)` to use as the length for `write(2)`. – Peter Cordes Sep 11 '18 at 23:19
  • 2
    @PeterCordes because the OP's goal is to learn how to write Linux programs in assembly. I suggested either assembling/linking it as 32-bit or moving on to 64-bit, since the tutorial they are following is outdated and doesn't take these problems into account. – Margaret Bloom Sep 12 '18 at 06:05
  • I'd rather move to another tutorial, since the code is outdated and a little weird as well. Thank you for the help! – ShellRox Sep 12 '18 at 07:22
  • 1
    @ShellRox about `int 0x80` working everywhere... for example, the embedded Linux inside Windows 10 has a 64-bit-only kernel, so a 64-bit binary using `int 0x80` will fail there, even if it works on a common 64-bit Ubuntu install (and that Linux inside Windows 10 is based on Ubuntu, so one might expect it to work in a similar way, but it does not). Of course an ordinary 32-bit binary will not work at all on such a 64-bit-only system, but the error message would probably be more to the point than a generic segfault. – Ped7g Sep 12 '18 at 12:25
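
The suggestions in the comments above can be sketched as one program. This is a minimal sketch, not a definitive fix: it assumes 32-bit Linux with the legacy `int 0x80` ABI, reads from stdin (fd 0) instead of stderr, and reuses the return value of `read` as the length for the final `write`. The `numSize` constant, the `.done` label, and the file name `echo.asm` in the build comment are hypothetical names added for illustration; the rest follows the tutorial code from the question.

```nasm
; Sketch only: 32-bit Linux, NASM syntax, legacy int 0x80 ABI.
; Assumed build steps (echo.asm is a hypothetical file name):
;   nasm -f elf32 echo.asm -o echo.o
;   ld -m elf_i386 echo.o -o echo

numSize equ 5                           ;Size of the input buffer in bytes

section .data                           ;Data segment
   userMsg db 'Please enter a number: ' ;Prompt shown to the user
   lenUserMsg equ $-userMsg             ;Length of the prompt
   dispMsg db 'You have entered: '
   lenDispMsg equ $-dispMsg

section .bss                            ;Uninitialized data
   num resb numSize                     ;Input buffer

section .text                           ;Code segment
   global _start

_start:
   ;write(1, userMsg, lenUserMsg)
   mov eax, 4                           ;sys_write
   mov ebx, 1                           ;fd 1 = stdout
   mov ecx, userMsg
   mov edx, lenUserMsg
   int 80h

   ;read(0, num, numSize)
   mov eax, 3                           ;sys_read
   mov ebx, 0                           ;fd 0 = stdin (the original used 2)
   mov ecx, num                         ;destination buffer
   mov edx, numSize                     ;maximum number of bytes to read
   int 80h
   mov esi, eax                         ;Bytes read, or a negative error code

   test esi, esi
   jle .done                            ;Nothing read or error: skip the echo

   ;write(1, dispMsg, lenDispMsg)
   mov eax, 4
   mov ebx, 1
   mov ecx, dispMsg
   mov edx, lenDispMsg
   int 80h

   ;write(1, num, bytes_read)
   mov eax, 4
   mov ebx, 1
   mov ecx, num
   mov edx, esi                         ;Only the bytes actually read
   int 80h

.done:
   ;exit(0)
   mov eax, 1                           ;sys_exit
   xor ebx, ebx                         ;Exit status 0
   int 80h
```

Note that `edx` in the `read` call is a byte count, not a data type, so input longer than `numSize` bytes is only partly consumed and the remainder stays queued on stdin.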
