The following code reads input from the user, similar to the gets function in C.

section .text
global _start

_start:
    mov eax, 3          ; Read user input into str
    mov ebx, 0          ; |
    mov ecx, str        ; | <- destination
    mov edx, 100        ; | <- length
    int 80h             ; \

    mov eax, 1          ; Return
    mov ebx, 0          ; | <- return code
    int 80h             ; \

section .data
    str: times 100 db 0 ; Allocate buffer of 100 bytes

I am not sure how Linux handles my code, but I am curious how this code is handled natively on an Intel machine (without any OS). As far as I know, interrupt 80h is looked up in an interrupt vector table and the corresponding handler code is called. But, going one step lower, how is that code written? What algorithm implements this functionality?

Can anyone please advise how to find the complete code of the function that handles user input at the lowest level on an actual machine? I am interested both in the function itself and in how to obtain such code.

Peter Cordes
ar2015
  • You removed the "linux-kernel" tag? You know this user-space code depends on an OS like Linux, right? It's just calling into the kernel. Your question asks "how is that code written", and in this case *that code* IS the Linux kernel. There isn't any x86 machine code separate from the OS that "runs" here, the OS talks to the real HW directly via PCI messages generated by the CPU from x86 load/store instructions (compiler-generated from C). – Peter Cordes Oct 26 '19 at 12:08
  • I think you're hoping that your question can have a short answer with only a few instructions on the kernel side. That's very far from the case, unless you want to use BIOS or UEFI services. But that's no different from using system calls under an OS as far as what kind of CPU instructions actually drive the hardware. – Peter Cordes Oct 26 '19 at 12:10
  • And BTW, I removed the [tag:interrupt] tag because it's just the mechanism for invoking a 32-bit Linux system call. `int xx` is not involved in actually accessing the keyboard hardware, and the CPU doesn't have any built-in interrupt handlers that do that. (The BIOS might set some up, but you asked about native / low-level, like how the OS does it internally, which does *not* involve going through a legacy BIOS interface.) If you ran *this code* without an OS, it would just crash. – Peter Cordes Oct 26 '19 at 12:18
  • @PeterCordes, Thank you very much for your comprehensive answer. I am actually interested in the low level code beyond the operating system. – ar2015 Oct 26 '19 at 12:56
  • There isn't any; a real OS like Linux that has its own drivers (not reliant on BIOS or UEFI services) isn't calling any other code, just its own code. All the x86 instructions executed by the CPU in the process of reading the keyboard and delivering ASCII to your process are instructions contained in the kernel binary. (And X server + terminal emulator). (Unless you access fake PS/2 hardware that's emulated by trapping to system management mode. IDK exactly how that really works, but it's not what happens when you type on a USB keyboard connected to your Linux PC.) – Peter Cordes Oct 26 '19 at 13:01
  • That's why my answer just has links to the Linux kernel code involved in talking to a USB keyboard, and to my other answer about how system calls dispatch inside the kernel. And to the osdev wiki which has info about programming USB drivers. – Peter Cordes Oct 26 '19 at 13:03
  • @PeterCordes, the answer is great but it is not the answer to this question. I would like to see the code of the BIOS, or wherever the interrupt is coming from, in the console rather than in an artificial graphical environment. – ar2015 Oct 28 '19 at 10:48
  • The BIOS's implementation of `int 10h` and [`int 16h`](https://en.wikipedia.org/wiki/INT_16H), or UEFI keyboard I/O, will presumably be similar to Linux's implementation of USB drivers. At the *lowest* level, some store and load instructions will access MMIO registers in the USB host controller. Linux's implementation of it is open-source so I can link it to you. It's not small; the driver code invoked by a keyboard input call runs many instructions. Or are you asking how to just *use* BIOS functions from real mode? – Peter Cordes Oct 28 '19 at 10:54
  • Let's say I would like to implement such functionality on a general embedded processor, and before that I would like to know how it works on a PC first. – ar2015 Oct 28 '19 at 12:27
  • Didn't see your reply because you didn't @notify me. On a non-x86 embedded system you generally don't have firmware implementing anything for you except a bootloader. So yes, you would need USB drivers exactly like I linked in my answer. You might simplify to an API for the console screen/keyboard instead of having redirectable file-descriptors, but you'd still need at least a basic USB-HID driver to talk to a USB keyboard. The code for it would presumably "look like" Linux's `usbhid/usbkbd.c`, but could be less portable/modular and not care about as many special cases. – Peter Cordes Oct 29 '19 at 22:10
  • I think you have some preconceived idea that a much shorter simpler answer to your question is possible. I'm pretty sure that is not the case. – Peter Cordes Oct 29 '19 at 22:12

1 Answer

At the lowest level (bare metal), there is no "function" you "call" with an int xx instruction. int xx can only invoke other code running on the CPU, not make something special happen.

Talking to hardware involves running code that uses loads or stores to generate PCI / PCIe transactions, or in/out instructions to access I/O space. (Or at least some access to special physical addresses; on older computers it wasn't necessarily PCI, and a few MMIO addresses on modern CPUs don't actually go off-chip so PCI isn't really involved.)
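To make the "loads or stores" point concrete, here is a minimal C sketch of how driver code typically touches a memory-mapped register: through a volatile pointer, so the compiler emits a real load or store for every access. All names here (mmio_read32, mmio_write32, fake_regs, FAKE_CTRL_ENABLE) are hypothetical, and an ordinary array stands in for the device so the sketch can run in user space. On real hardware the address would instead be a device address that the kernel has mapped.

```c
#include <stdint.h>

/* Hypothetical helpers: one 32-bit load or store to a device register.
 * volatile stops the compiler from caching or dropping the access. */
static inline uint32_t mmio_read32(const volatile uint32_t *reg)
{
    return *reg;
}

static inline void mmio_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Assumption: a made-up register layout, not any real device. */
enum { FAKE_REG_CONTROL = 0, FAKE_REG_STATUS = 1, FAKE_REG_COUNT = 2 };
#define FAKE_CTRL_ENABLE 0x1u

/* Stand-in for device registers; a real driver would use a mapped address. */
static volatile uint32_t fake_regs[FAKE_REG_COUNT];

/* Set the enable bit with a read-modify-write and return the new value. */
uint32_t fake_device_enable(void)
{
    uint32_t ctrl = mmio_read32(&fake_regs[FAKE_REG_CONTROL]);
    mmio_write32(&fake_regs[FAKE_REG_CONTROL], ctrl | FAKE_CTRL_ENABLE);
    return mmio_read32(&fake_regs[FAKE_REG_CONTROL]);
}
```

Port I/O with in / out looks similar in spirit but needs inline asm and I/O privilege, so it is left out of this user-space sketch.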

As far as I know, interrupt 80h is searched through an interrupt vector table and the related code function is called. But, going one step lower, how is that code written?

Yes, the Linux kernel (written in C and assembly) sets up the IDT so user-space int 80h will enter the kernel at its int80 handler entry-point.

int 80h / eax=3 in user-space just dispatches to the sys_read function in the Linux kernel, which implements the POSIX read() function / system call. (See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? for a bit about the kernel side of that dispatching to the table of system calls).
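The dispatching step can be pictured as indexing a table of function pointers with the call number from eax. The following is only an illustrative sketch, not the kernel's actual code: demo_table, demo_dispatch, demo_read and demo_exit are hypothetical names, and the handlers are stubs. The only details taken from the question are that call number 3 is read and call number 1 is exit in the 32-bit ABI.

```c
#include <stddef.h>

/* Hypothetical handler type: three integer arguments, like ebx/ecx/edx. */
typedef long (*syscall_fn)(long a, long b, long c);

/* Stub handlers for illustration only. */
static long demo_exit(long code, long unused1, long unused2)
{
    (void)unused1; (void)unused2;
    return code;                 /* a real exit would never return */
}

static long demo_read(long fd, long buf, long len)
{
    (void)fd; (void)buf;
    return len;                  /* pretend the whole buffer was filled */
}

#define DEMO_ENOSYS 38           /* Linux errno value for "not implemented" */

/* Call number -> handler; unlisted slots are NULL. */
static const syscall_fn demo_table[] = {
    [1] = demo_exit,
    [3] = demo_read,
};

/* What an int 0x80 entry point conceptually does with eax, ebx, ecx, edx. */
long demo_dispatch(unsigned long nr, long a, long b, long c)
{
    size_t n = sizeof demo_table / sizeof demo_table[0];
    if (nr >= n || demo_table[nr] == NULL)
        return -DEMO_ENOSYS;
    return demo_table[nr](a, b, c);
}
```

With this sketch, demo_dispatch(3, 0, 0, 100) reaches the read stub, and an unknown number yields a negative error code instead of a crash.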

Think of int 0x80 (or the more efficient sysenter and syscall instructions) as ways to make a function call across a privilege boundary. All the magic is (as you guessed) in the implementation of that function inside the kernel, and the other functions it calls. (And the whole Unix model of "everything is a file"; this is the same system call you'd use to read a disk file.)

It takes a file-descriptor arg, in your case stdin = 0. That's just a file, often a TTY or pseudo-tty connected to a terminal emulator, which talks to an X11 server to get keyboard events. You could be running on a Linux text console (ctrl+alt+f2) in which case the kernel is running a terminal emulator with keyboard input coming from whatever physical keyboard(s) is/are connected. Or you could have redirected input from any type of file.
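As a rough illustration of "it's just a file descriptor", the sketch below uses the same POSIX read() call on a pipe instead of a terminal. The helper names read_into and demo_pipe_roundtrip are hypothetical; the pipe is only there so the example feeds itself input and does not depend on a keyboard.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap-1 bytes from fd and NUL-terminate.
 * Returns the byte count from read(), or -1 on error. */
ssize_t read_into(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}

/* Write msg into a pipe, then read it back through read_into(). */
ssize_t demo_pipe_roundtrip(const char *msg, char *out, size_t cap)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);               /* reader sees EOF after the data */

    ssize_t n = (written == (ssize_t)len) ? read_into(fds[0], out, cap) : -1;
    close(fds[0]);
    return n;
}
```

Calling read_into(0, buf, sizeof buf) would issue the same read system call on stdin that the question's assembly makes; the kernel decides which driver or pipe code runs based on what fd 0 refers to.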

If I redirect input from /dev/input/by-id/usb-Logitech_USB_Receiver-if02-event-kbd on my system, I can get some raw keypress events, but not in ASCII text format. (You can safely sudo cat that file; input still goes to your X server as well, so you can control-C it.) That's a more direct way to talk to the keyboard driver, but you're still going through Linux's HID (Human Interface Device) and event subsystem before you actually get to code that accesses the USB host controller connected to your keyboard.
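For a sense of what those raw events look like, here is a hedged sketch that interprets records of type struct input_event from <linux/input.h>, which is the format such event device files deliver. The names classify_event, dump_key_events and the ACT_* values are hypothetical; the sketch assumes a Linux system with the kernel headers installed.

```c
#include <linux/input.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical classification of one event record. */
enum key_action { ACT_NONE = -1, ACT_RELEASE = 0, ACT_PRESS = 1, ACT_REPEAT = 2 };

/* For EV_KEY events, value is 0 (release), 1 (press) or 2 (autorepeat).
 * Stores the key code in *code_out only for key events. */
enum key_action classify_event(const struct input_event *ev, unsigned *code_out)
{
    if (ev->type != EV_KEY || ev->value < 0 || ev->value > 2)
        return ACT_NONE;
    *code_out = ev->code;
    return (enum key_action)ev->value;
}

/* Read fixed-size records from an already opened event device and print them. */
void dump_key_events(int fd)
{
    struct input_event ev;
    while (read(fd, &ev, sizeof ev) == (ssize_t)sizeof ev) {
        unsigned code;
        enum key_action act = classify_event(&ev, &code);
        if (act != ACT_NONE)
            printf("key code %u, action %d\n", code, (int)act);
    }
}
```

The codes are key numbers such as KEY_A, not ASCII; turning them into characters (layout, shift state) is done by later layers.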


There's nothing remotely similar you can do on bare metal; you'd use in / out or MMIO loads / stores to talk to a keyboard through a USB host controller (e.g. EHCI or xHCI). https://wiki.osdev.org/Universal_Serial_Bus
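Most of the work in a USB keyboard driver is getting the host controller to deliver data at all. Once it does, the data from a keyboard in HID boot protocol is an 8-byte report: a modifier byte, a reserved byte, and up to six key usage codes. The sketch below decodes only the first key of such a report. The function name hid_boot_report_to_ascii is hypothetical, the mapping is deliberately partial, and it assumes a US layout and boot protocol (devices in report protocol may use a different layout).

```c
#include <stdint.h>

#define HID_MOD_LSHIFT 0x02u     /* modifier byte, bit 1 */
#define HID_MOD_RSHIFT 0x20u     /* modifier byte, bit 5 */

/* Translate the first key slot of an 8-byte boot-protocol keyboard report.
 * Returns 0 when no key is pressed or the key is not in this partial table. */
char hid_boot_report_to_ascii(const uint8_t report[8])
{
    uint8_t mods = report[0];
    uint8_t usage = report[2];   /* bytes 2..7 hold up to six pressed keys */
    int shift = (mods & (HID_MOD_LSHIFT | HID_MOD_RSHIFT)) != 0;

    if (usage >= 0x04 && usage <= 0x1D)              /* a..z */
        return (char)((shift ? 'A' : 'a') + (usage - 0x04));
    if (usage >= 0x1E && usage <= 0x26 && !shift)    /* 1..9 */
        return (char)('1' + (usage - 0x1E));
    if (usage == 0x27 && !shift)                     /* 0 */
        return '0';
    if (usage == 0x28)                               /* Enter */
        return '\n';
    if (usage == 0x2C)                               /* Space */
        return ' ';
    return 0;
}
```

A real driver also has to track which keys were already down in the previous report, handle key repeat, and cope with more than one key per report.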

(With BIOS emulation of a PS/2 keyboard, or on an old machine with a real PS/2 keyboard controller, you could talk to that much more easily. Back in early PC days, lots of different programs running in real mode would interact with hardware directly so there were simpler standards for how to access it, often with in and out instructions to well-known port numbers. No PCI bus enumeration or anything needed.

In modern PCs you can still do that, but mostly it's faked by software that traps the accesses. BIOS emulation is by definition not bare metal. Only motherboard firmware developers truly program bare metal. System Management Mode lets the firmware set up hooks that can run even after the OS boots. Fortunately most systems don't do much / any of that, although there are still ACPI tables that a kernel should read instead of probing hardware directly.)
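To show why the PS/2 route is so much simpler, here is a small sketch of the translation step only. On real or emulated PS/2 hardware a byte would be read from port 0x60 once bit 0 of the status port 0x64 says data is available; that part needs in instructions and I/O privilege, so it is described in a comment rather than executed. The function name ps2_set1_to_ascii is hypothetical, the table is partial, and it assumes scancode set 1 and a US layout.

```c
#include <stdint.h>

/* On hardware, roughly:
 *   wait until (in(0x64) & 1) is set, then scancode = in(0x60);
 * Here the scancode is simply passed in as an argument. */

/* Partial scancode set 1 -> ASCII mapping.
 * Returns 0 for key releases and for keys not in this small table. */
int ps2_set1_to_ascii(uint8_t scancode)
{
    if (scancode & 0x80)         /* break code: make code with bit 7 set */
        return 0;

    switch (scancode) {
    case 0x1E: return 'a';
    case 0x30: return 'b';
    case 0x2E: return 'c';
    case 0x02: return '1';
    case 0x1C: return '\n';      /* Enter */
    case 0x39: return ' ';       /* Space */
    default:   return 0;
    }
}
```

A gets-like routine on such a machine would loop on this: poll or take the keyboard interrupt, translate the byte, append it to the buffer, and stop at Enter.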


If you boot a legacy bootloader, the firmware will switch back to real mode and set up a bunch of "BIOS services" which you can use via int 10h and other interrupt numbers. Much like running under a Linux kernel, you're not remotely close to talking "directly" to real hardware; all device-driver details are hidden behind a standard API. https://wiki.osdev.org/BIOS

If you boot a modern UEFI bootloader, again you have a standard API for accessing screen / keyboard, with "driver" code provided by the firmware. It's like a minimal kernel. https://wiki.osdev.org/UEFI


Once a real kernel like Linux boots, it does have device drivers for the real USB controllers. If the BIOS had set up emulation of PS/2 hardware, the kernel replaces / disables that.

Like I said, this Linux kernel code is written in C, with some inline asm wrappers for a few things.

This is almost certainly more code than you want to wade through, but it is the answer to your question.

shaedrich
Peter Cordes