11

The code below opens a file, reads it into a buffer, and then closes the file.

The close system call requires the file descriptor number to be in the ebx register. In my code, ebx is loaded with the file descriptor before the read system call is made. My question is: should I save ebx on the stack (or somewhere else) before making the read system call, in case int 80h trashes it, and then restore it for the close system call? Or is the code below fine and safe as it stands?

I have run the code below and it works. I'm just not sure whether it is considered good assembly practice, because I don't save the ebx register before the int 80h read call.

;; open up the input file 
mov eax,5        ; open file system call number
mov ebx,[esp+8]  ; null terminated string file name, first command line parameter
mov ecx,0o       ; access type: O_RDONLY
int 80h          ; file handle or negative error number put in eax
test eax,eax
js Error         ; test sign flag (SF) for negative number which signals error

;; read in the full input file
mov ebx,eax            ; assign input file descriptor
mov eax,3              ; read system call number
mov ecx,InputBuff      ; buffer to read into
mov edx,INPUT_BUFF_LEN ; total bytes to read
int 80h
test eax,eax
js Error               ; if eax is negative then error
jz Error               ; if no bytes were read then error
add eax,InputBuff      ; add size of input to the beginning of InputBuff location
mov [InputEnd],eax     ; assign address of end of input

;; close the input file
;; file descriptor is already in ebx
mov eax,6       ; close file system call number
int 80h         
Matthew Slattery
mudgen
  • suggestion: Do one test for read's result being `<= 0` on the fast path, then sort it out in `Error`. That reduces the amount of branch-prediction history entries your code normally needs. `jle` will work, because `test eax,eax` clears the overflow and carry flags, and sets SF and ZF according to the result the same way `cmp eax, 0` does. – Peter Cordes Feb 01 '16 at 03:12
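The single-branch check suggested in the comment above could look roughly like this. It is only a sketch: it reads from stdin (descriptor 0) as a stand-in for the opened file, and the buffer size of 4096 and the exit statuses are assumptions, not values from the question.

```nasm
; Assumed build commands: nasm -f elf32 read_check.asm && ld -m elf_i386 -o read_check read_check.o
INPUT_BUFF_LEN equ 4096         ; assumed size, the question does not give one

global _start

section .bss
InputBuff: resb INPUT_BUFF_LEN

section .text
_start:
    mov eax,3                   ; read system call number
    xor ebx,ebx                 ; fd 0 (stdin), standing in for the opened file
    mov ecx,InputBuff           ; buffer to read into
    mov edx,INPUT_BUFF_LEN      ; total bytes to read
    int 80h
    test eax,eax                ; sets SF and ZF from eax, clears OF and CF
    jle Error                   ; one branch covers both eax < 0 and eax == 0

    xor ebx,ebx                 ; exit status 0: at least one byte was read
    jmp Exit
Error:
    mov ebx,1                   ; exit status 1: error or empty input
Exit:
    mov eax,1                   ; exit system call number
    int 80h
```

If the two failure cases need different handling, `Error` can still tell them apart, since eax holds 0 for empty input and a negative error number otherwise.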

2 Answers

11

The int 80h call itself will not corrupt anything, apart from putting the return value in eax. So the code fragment you have is fine. (But if your code fragment is part of a larger routine which is expected to be called by other code following the usual Linux x86 ABI, you will need to preserve ebx, and possibly other registers, on entry to your routine, and restore on exit.)
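To illustrate the parenthetical point about routines called from other code, here is a minimal sketch of a wrapper that a C caller could use. The name `read_fd` and its C-style signature are hypothetical; the push and pop are there because the i386 System V calling convention treats ebx as callee-saved, not because int 80h modifies it.

```nasm
; Assumed build command: nasm -f elf32 read_fd.asm (then link with the calling code)
; Hypothetical C prototype: int read_fd(int fd, void *buf, unsigned len);

global read_fd

section .text
read_fd:
    push ebx                ; callee-saved register, the caller expects it unchanged
    mov eax,3               ; read system call number
    mov ebx,[esp+8]         ; fd:  skip saved ebx (4 bytes) and return address (4 bytes)
    mov ecx,[esp+12]        ; buf
    mov edx,[esp+16]        ; len
    int 80h                 ; result or negative error number comes back in eax
    pop ebx                 ; restore the caller's ebx
    ret                     ; eax is the return value
```

ecx and edx are not saved here because the same convention lets a called function overwrite them.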

The relevant code in the kernel can be found in arch/x86/kernel/entry_32.S. It's a bit hard to follow, due to extensive use of macros, and various details (support for syscall tracing, DWARF debugging annotations, etc.) but: the int 80h handler is system_call (line 493 in the version I've linked to); the registers are saved via the SAVE_ALL macro (line 497); and they're restored again via RESTORE_REGS (line 534) just before returning.

JohnnyFromBF
Matthew Slattery
  • In general, according to `syscall(2)`, "some architectures may indiscriminately clobber other registers not listed here". Specifically, x86-64 system calls (made with `syscall`) *do* clobber `rcx` and `r11`, according to my reading of [this writeup of entry_64.S](https://github.com/0xAX/linux-insides/blob/master/SysCall/syscall-2.md). This is backed up by the fact that the `sysret` instruction (used by entry_64.S) does `RIP=RCX`, and `RFLAGS=R11`, and some segment stuff. You're back in user mode after executing it. **AFAICT, x86-64 syscalls preserve everything except R11, RCX, and RAX**. – Peter Cordes Feb 01 '16 at 10:42
  • I was unable to find any specific documentation, or even comments in the code, stating the exact situation for amd64 or i386. I think it's weird that an important point like that is just left for readers to work out from decoding the macros. I mean in practice it's mostly just libc implementors that need that info, but given Linux's commitment to maintaining a stable ABI, I don't expect it to ever change. So it could get written down. (And I guess did, here, so I upvoted :P) – Peter Cordes Feb 01 '16 at 10:47
  • Update: yes, `syscall` itself clobbers RCX/R11, and [Linux system calls made using `syscall` only clobber those + `rax`](https://stackoverflow.com/questions/2535989/what-are-the-calling-conventions-for-unix-linux-system-calls-on-i386-and-x86-6). But IIRC, it was my edit that included that; I don't remember if I ever found external documentation other than Linux's source code for this. – Peter Cordes Dec 13 '17 at 07:04
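For contrast with the 32-bit case, the x86-64 behaviour described in these comments can be sketched as below. The message text and the loop count are made up for illustration. The counter lives in r12 rather than rcx because, as stated above, `syscall` overwrites rcx and r11.

```nasm
; Assumed build commands: nasm -f elf64 hello64.asm && ld -o hello64 hello64.o

global _start

section .data
msg: db "hello", 10
len  equ $ - msg

section .text
_start:
    mov r12,3               ; loop counter kept out of rcx and r11
.again:
    mov eax,1               ; write system call number on x86-64
    mov edi,1               ; fd 1 (stdout)
    lea rsi,[rel msg]       ; buffer to write
    mov edx,len             ; number of bytes
    syscall                 ; return value in rax; rcx and r11 are overwritten
    dec r12
    jnz .again

    mov eax,60              ; exit system call number on x86-64
    xor edi,edi             ; exit status 0
    syscall
```

The argument registers are reloaded on every pass only to keep the loop body easy to read; going by the comments above, rdi, rsi and rdx would survive the call.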
1

Yes, you should save and restore as in http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/040/4048/4048l1.html

  • That code is saving and restoring `%ebx` so that it can use it to pass in the argument - not because the `int $0x80` itself is corrupting it. – Matthew Slattery Apr 25 '10 at 16:56