
How can I read from stdin and write to stdout using GCC inline assembly, the way we do it in NASM?

_start:
mov ecx, buffer ;buffer is a data word initialised 0h in section .data
mov edx, 03
mov eax, 03 ;read
mov ebx, 00 ;stdin
int 0x80
;Output the number entered
mov eax, 04 ;write
mov ebx, 01 ;stdout
int 0x80

I tried reading from stdin in inline assembly and then assign the input to x:

#include<stdio.h>
int x;
int main()
{
    asm(" movl $5,  %%edx \n\t" " 
    movl $0,  %%ebx \n\t" " 
    movl $3,  %%eax \n\t" " 
    int $0x80 \n\t "
    mov %%ecx,x" 
    ::: "%eax", "%ebx", "%ecx", "%edx");

    printf("%d",x);  
    return 0;
}

However it fails to do so.

The linked question, "syscall from within GCC inline assembly", contains code that can only print a single character to stdout.

  • I am not on linux, so I cannot test. However, some thoughts occur to me: 1) ebx is a file handle. This suggests to me that if no input is ready at the instant the call is made, it may return zero bytes rather than waiting for input. Are you testing the executable with `a.out < foo.txt`? 2) I believe ecx is supposed to be a buffer, not a place to return a single byte. 3) It looks like edx is supposed to be the size of the buffer for ecx (not sure where you are getting 5). 4) This is really inefficiently written asm. I would offer to re-write it, but comments can only be so long and I'm out of s – David Wohlferd Aug 17 '14 at 04:17
  • Your C inline asm version doesn't have a buffer, and doesn't set `%ecx` before `int $0x80`. And the `int $0x80` return value is in `%eax`, not `%ecx`... https://stackoverflow.com/questions/2535989/what-are-the-calling-conventions-for-unix-linux-system-calls-on-x86-64. – Peter Cordes Sep 07 '17 at 09:34
  • See **[What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/questions/46087730/what-happens-if-you-use-the-32-bit-int-0x80-linux-abi-in-64-bit-code)** for why the `int 0x80` version in the answer fails in 64-bit code when the buffer is a local on the stack (not `static` or global). – Peter Cordes Mar 01 '18 at 03:29

1 Answer


This code is based solely on my reading of Linux references. I'm not on Linux, so I cannot test it, but it should be pretty close. I would test it using redirection: `a.out < foo.txt`

#include <stdio.h>
#include <unistd.h> /* STDIN_FILENO, ssize_t */

#define SYS_READ 3

int main()
{
   char buff[10]; /* Declare a buff to hold the returned string. */
   ssize_t charsread; /* The number of characters returned. */

   /* Use constraints to move buffer size into edx, stdin handle number
      into ebx, address of buff into ecx.  Also, "0" means this value
      goes into the same place as parameter 0 (charsread).  So eax will
      hold SYS_READ (3) on input, and charsread on output.  Lastly, you
      MUST use the "memory" clobber since you are changing the contents
      of buff without any of the constraints saying that you are.

      This is a much better approach than doing the "mov" statements
      inside the asm.  For one thing, since gcc will be moving the 
      values into the registers, it can RE-USE them if you make a 
      second call to read more chars. */

   asm volatile("int $0x80" /* Call the syscall interrupt. */
      : "=a" (charsread) 
      : "0" (SYS_READ), "b" (STDIN_FILENO), "c" (buff), "d" (sizeof(buff))
      : "memory", "cc");

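    /* read does not NUL-terminate buff, and a negative charsread is an
       error code, so print only the bytes that were actually read. */
    printf("%d: %.*s", (int)charsread, charsread > 0 ? (int)charsread : 0, buff);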

    return 0;
}

Responding to Aanchal Dalmia's comments below:

1) As Timothy says below, even if you aren't using the return value, you must let gcc know that the ax register is being modified. In other words, it isn't safe to remove the "=a" (charsread), even if it appears to work.

2) I was really confused by your observation that this code wouldn't work unless buff was global. Now that I have a Linux install to play with, I was able to reproduce the error and I suspect I know the problem. I'll bet you are using int 0x80 on an x64 system. That's not how you are supposed to call the kernel in 64-bit code.

Here is some alternate code that shows how to do this call in x64. Note that the function number and the registers have changed from the example above (see http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64):

#include <stdio.h>

#define SYS_READ 0
#define STDIN_FILENO 0

int main()
{
   char buff[10]; /* Declare a buff to hold the returned string. */
   ssize_t charsread; /* The number of characters returned. */

   /* Use constraints to move buffer size into rdx, stdin handle number
      into rdi, address of buff into rsi.  Also, "0" means this value
      goes into the same place as parameter 0 (charsread).  So rax will
      hold SYS_READ on input, and charsread on output.  Lastly, I
      use the "memory" clobber since I am changing the contents
      of buff without any of the constraints saying that I am.

      This is a much better approach than doing the "mov" statements
      inside the asm.  For one thing, since gcc will be moving the 
      values into the registers, it can RE-USE them if you make a 
      second call to read more chars. */

   asm volatile("syscall" /* Make the syscall. */
      : "=a" (charsread) 
      : "0" (SYS_READ), "D" (STDIN_FILENO), "S" (buff), "d" (sizeof(buff))
      : "rcx", "r11", "memory", "cc");

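    /* read does not NUL-terminate buff, and a negative charsread is an
       error code, so print only the bytes that were actually read. */
    printf("%d: %.*s", (int)charsread, charsread > 0 ? (int)charsread : 0, buff);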

    return 0;
}

It's going to take a better Linux expert than me to explain why int 0x80 on x64 wouldn't work with stack variables. But using syscall does work, and syscall is faster on x64 than int 0x80.

Edit: It has been pointed out to me that the kernel clobbers rcx and r11 during syscalls. Failing to account for this can cause all sorts of problems, so I have added them to the clobber list.

David Wohlferd
  • Thanks a lot. It works fine. However, one small change is needed: the char array needs to be made global. http://ideone.com/LNonCE Thank you once again. – Aanchal Dalmia Aug 17 '14 at 14:57
  • Glad you got it working. Since I can't run this myself, your reply surprised me. The code on ideone removes the `"=a" (charsread)`. Did that not return the number of chars read like I expected? Or did you just not need it? And are you saying the interrupt didn't work without buff being global (sure seems like it should)? Or just that you needed it global for some other reason? Information on using int 0x80 seems scarce and future SO readers may want to know. – David Wohlferd Aug 17 '14 at 21:40
  • Yes, if the array wasn't made global the code would not print anything. Don't know the reason. As far as for "=a"(charsread), i didn't need it and thus removed it. Thank you for the help. – Aanchal Dalmia Aug 18 '14 at 06:17
  • Fun fact: `syscall` *doesn't* clobber `"cc"`. Linux saves/restores RFLAGS. Same for `int 0x80`. So this is one of those rare cases where the implicit `"cc"` clobber for x86 / x86-64 could in theory cause missed optimizations :P – Peter Cordes Sep 07 '17 at 06:32
  • @PeterCordes I thought the `syscall` instruction *itself* modified RFLAGS. Having linux save/restore wouldn't matter if the value was changed before the OS gets invoked. – David Wohlferd Sep 07 '17 at 07:06
  • @DavidWohlferd: It stashes the original value in R11. (And RIP in RCX). That's why syscall/sysret clobber RCX and R11. – Peter Cordes Sep 07 '17 at 07:10