
I want to understand how single-character input works in x86 assembly. The following snippet of a program performs the input, but I would like a deeper explanation of how exactly it works. (This is a Linux environment, hence the interrupt number 128.)

sys_exit equ 1
sys_read equ 3
sys_write equ 4
stdin equ 0
stdout equ 1

segment .bss
  num1 resb 2
  num2 resb 2
  res resb 1

...

segment .text
  mov eax,sys_read
  mov ebx,stdin
  mov ecx,num1
  mov edx,2
  int 128

I don't exactly understand how the input mechanism works here (abstractly, of course). For example, I imagine it goes like this:

First, eax is loaded with the read system call number, then the standard input file descriptor is loaded into ebx.

Now this is the part that I don't understand: num1 is, after all, an address, so how does loading it into ecx receive the input from the standard input device? What buffering does mov ebx,stdin do, if any? And how does the system know, at this line, at which address the input character must be stored? My best guess is that it keeps a pointer relative to the .bss section and keeps receiving input through it. But that raises another question: how do we know whether uninitialized data is stored sequentially? Is it? Please help me understand.

Peter Cordes
Pawan Nirpal
  • It's a Linux `read` system call. It runs code inside the kernel. See [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/a/46087731), which covers some details of what happens inside the kernel as part of answering that question, but only generically, about dispatching to the handler. For `read` specifically, the Linux kernel eventually runs `copy_to_user` if the user-space address is valid and the `fd` is an open file descriptor on a readable file (including a character device file like a TTY). – Peter Cordes May 19 '22 at 20:00
  • Some more details: nothing happens until you get to the `int 128`, which you can treat like a function call to the OS. `mov ebx,stdin` does not do any buffering; it loads a value (zero in this case) into `ebx`. The OS, when it gets control, will look at that value to know which file to read from. Yes, things you reserve the way your code does are laid out sequentially. The pointer you pass in `ecx` is an absolute address; it is not relative to `.bss`. The whole sequence is abstractly equivalent to `read(stdin, &num1, 2)`, with the `int 128` being the function call and the rest just loading the arguments. – Jester May 19 '22 at 21:59
  • Do you know how `read(0, num1, 2);` works in C? – that other guy May 19 '22 at 22:01
  • Yes, I am getting the sense of it now, thank you. – Pawan Nirpal May 20 '22 at 04:32
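
To make the `read(0, num1, 2)` analogy from the comments concrete, here is a minimal C sketch of the same system call. It assumes Linux with glibc. The helper name `read_chars` is hypothetical and not from the question. Its three arguments play the roles of `ebx` (file descriptor), `ecx` (absolute address of the buffer) and `edx` (maximum byte count), while `SYS_read` plays the role of the value loaded into `eax`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the question): issue the raw read system
 * call, roughly what the mov/mov/mov/mov/int 128 sequence does.
 *   fd  -> which open file to read from      (ebx in the question)
 *   buf -> absolute address to copy data to  (ecx in the question)
 *   len -> maximum number of bytes to copy   (edx in the question)
 * Returns the number of bytes the kernel copied into buf, or -1 on error.
 */
long read_chars(int fd, char *buf, unsigned long len)
{
    /* The kernel itself rejects bad addresses; this only catches NULL early. */
    assert(buf != NULL);

    /* SYS_read is the system call number for the target architecture. */
    return syscall(SYS_read, fd, buf, len);
}
```

Calling `read_chars(0, num1, 2)` with a 2-byte array `num1` would correspond to the register setup in the question. Nothing is buffered by passing the arguments: the kernel only copies bytes into the buffer once the call itself is made.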
