5

Is it possible to write a single character using a syscall from within an inline assembly block? If so, how? It should look something like this:

__asm__ __volatile__
                    (
                     " movl $1,  %%edx \n\t"
                     " movl $80, %%ecx \n\t"
                     " movl $0,  %%ebx \n\t"
                     " movl $4,  %%eax \n\t"
                     " int $0x80       \n\t"
                     ::: "%eax", "%ebx", "%ecx", "%edx"
                    );

$80 is 'P' in ASCII, but this prints nothing.

Any suggestions much appreciated!

guest

3 Answers

8

You can use architecture-specific constraints to place the arguments directly in specific registers, without needing the movl instructions in your inline assembly. You can then use the & operator to get the address of the character:

#include <sys/syscall.h>

void sys_putc(char c) {
    // write(int fd, const void *buf, size_t count); 
    int ret;
    asm volatile("int $0x80" 
            : "=a"(ret)                    // outputs
            : "a"(SYS_write), "b"(1), "c"(&c), "d"(1)  // inputs
            : "memory");                   // clobbers
}

int main(void) {
    sys_putc('P');
    sys_putc('\n');
}

(Editor's note: the "memory" clobber is needed, or some other way of telling the compiler that the memory pointed to by &c is read. See "How can I indicate that the memory *pointed* to by an inline ASM argument may be used?")


(In this case, =a(ret) is needed to indicate that the syscall clobbers EAX. We can't list EAX as a clobber because we need an input operand to use that register. The "a" constraint is like "r" but can only pick AL/AX/EAX/RAX. )

$ cc -m32 sys_putc.c && ./a.out
P
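
The editor's note mentions "some other way" of telling the compiler that the memory behind &c is read. One such way is a dummy "m" input operand, which names exactly the object the asm reads and so avoids the broader "memory" clobber. The following is only a sketch under stated assumptions: putc_fd and its fd parameter are hypothetical names, not part of the answer, and on anything other than 32-bit x86 it falls back to libc write() just so that it still builds.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical variant of sys_putc that takes the file descriptor as a parameter.
 * Returns the raw result: bytes written, or a negative errno value. */
int putc_fd(int fd, char c)
{
    int ret;
#if defined(__i386__)
    __asm__ volatile("int $0x80"
                     : "=a"(ret)                 /* EAX holds the return value */
                     : "0"(SYS_write), "b"(fd), "c"(&c), "d"(1),
                       "m"(c));                  /* dummy operand: the asm reads c from memory */
    /* No "memory" clobber: the "m"(c) operand already tells the compiler c is read. */
#else
    /* Assumption: not 32-bit x86, so use libc and mimic the raw -errno convention. */
    ssize_t n = write(fd, &c, 1);
    ret = (n < 0) ? -errno : (int)n;
#endif
    return ret;
}
```

Calling putc_fd(1, 'P') should behave like sys_putc('P') above; the dummy operand is never referenced in the template, it only informs the optimizer.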

You could also return the number of bytes written that the syscall returns, and use "0" as a constraint to indicate EAX again:

int sys_putc(char c) {
    int ret;
    asm volatile("int $0x80" : "=a"(ret) : "0"(SYS_write), "b"(1), "c"(&c), "d"(1) : "memory");
    return ret;
}

Note that on error, the system call return value will be a -errno code like -EBADF (bad file descriptor) or -EFAULT (bad pointer).

The normal libc system call wrapper functions check for a return value of unsigned eax > -4096UL and set errno + return -1.
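
As a rough illustration of that check, here is a minimal sketch of the conversion from a raw kernel return value to the libc convention. The helper name syscall_ret is an assumption made for this example; real libc implementations do this inside their own wrappers and differ in detail.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw syscall result into the libc convention.
 * Raw values in [-4095, -1] are -errno codes; anything else is a success value. */
long syscall_ret(long raw)
{
    if ((unsigned long)raw > -4096UL) {  /* unsigned compare catches -4095..-1 */
        errno = (int)-raw;               /* e.g. -EBADF becomes errno = EBADF */
        return -1;
    }
    return raw;                          /* success: pass the value through */
}
```

With this, syscall_ret(sys_putc('P')) would return 1 on success, or -1 with errno set on failure.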


Also note that compiling with -m32 is required: the 64-bit syscall ABI uses different call numbers (and registers), but this asm is hard-coding the slow way of invoking the 32-bit ABI, int $0x80.

Compiling in 64-bit mode makes sys/syscall.h define SYS_write with the 64-bit call number, which would break this code. So would 64-bit stack addresses, even if you used the right numbers. See "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?" In short: don't do that.
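
For comparison, a 64-bit build would use the syscall instruction with the x86-64 register assignment (call number in RAX, arguments in RDI, RSI, RDX) and must declare RCX and R11 as clobbered. The sketch below is an illustration, not code from the answer: raw_write is a hypothetical name, and the last branch is an assumed libc fallback so that it also builds on other architectures.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical raw write(2) wrapper: returns bytes written, or a negative errno value. */
long raw_write(int fd, const void *buf, size_t count)
{
    long ret;
#if defined(__x86_64__)
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "0"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(count)
                     : "rcx", "r11", "memory");  /* syscall overwrites RCX and R11 */
#elif defined(__i386__)
    __asm__ volatile("int $0x80"
                     : "=a"(ret)
                     : "0"((long)SYS_write), "b"(fd), "c"(buf), "d"(count)
                     : "memory");
#else
    /* Assumption: any other architecture goes through libc instead. */
    ssize_t n = write(fd, buf, count);
    ret = (n < 0) ? -errno : (long)n;
#endif
    return ret;
}
```

Because SYS_write comes from sys/syscall.h, each branch picks up the call number that matches the mode it is compiled in.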

Peter Cordes
Dato
  • 1
    In the first example I consider it a bug. One issue is that `int $0x80` alters _EAX_ since it contains a return value. You'll need to make sure you have some kind of output constraint otherwise the optimizer may just assume that _EAX_ contains the value of SYS_write across the extended assembly template. Your second code example does it properly. – Michael Petch Aug 06 '17 at 18:48
  • Excellent point, thank you! I assume it would be enough to add "%eax" in the clobber list of the first version? (I did want to include a version with no outputs as well.) – Dato Aug 06 '17 at 18:49
  • I don't think an _EAX_ clobber will work there, because the compiler may treat it as conflicting with the input-only constraint using `"a"`. I suspect some kind of error would be raised. – Michael Petch Aug 06 '17 at 18:51
  • You’re right. There was a bug opened against gcc about this a while ago, their recommendation was to use a dummy output variable. (Somebody suggested allowing `&` in input operands, but wasn’t met with enthusiasm.) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43998 – Dato Aug 06 '17 at 19:00
  • 1
    The best way to deal with the EAX clobber issue in the first version is to create a dummy output variable. If you compile with optimizations, the optimizer will be able to remove it if its value is never used. Just remember to mark the revised version volatile, since otherwise the optimizer will see no side effects it knows about and may optimize the asm statement away entirely. This would work: `int retval; asm volatile("int $0x80" : "=a"(retval): "0"(SYS_write), "b"(1), "c"(&c), "d"(1));` . retval's output will remain unused. – Michael Petch Aug 06 '17 at 19:13
  • Yes, I just did that, thank you. Nevertheless, I’m now seeing that something goes wrong with -O2 and `char` as parameter. It works with clang, but with gcc -O2 (with or without -fno-inline) the address of _c_ never makes it to _ECX_. I guess it has to do something with the alignment of the stack, but I’m really out of my depth. – Dato Aug 06 '17 at 19:41
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/151218/discussion-between-dato-and-michael-petch). – Dato Aug 06 '17 at 19:42
2

IIRC, two things are wrong in your example. First, you're writing to stdin with mov $0, %ebx. Second, write takes a pointer as its second argument, so to write a single character you need that character stored somewhere in memory; you can't put the value directly in %ecx.

ex:

.data
char: .byte 80
.text
mov $char, %ecx

I've only done pure asm on Linux, never inline asm with gcc. You can't drop data into the middle of the assembly, so I'm not sure how you'd get the pointer using inline assembly.

EDIT: I think I just remembered how to do it. You could push 'P' onto the stack and use %esp as the pointer:

pushw $80
movl %%esp, %%ecx
... int $0x80 ...
addl $2, %%esp
Michael Petch
cthom06
  • in C i do "the same" with `char *p = 'P'; write(1, &p, 1);`, so stdout is 1 – guest Jun 02 '10 at 14:27
  • char *p = 'P' should be char p = 'P' I suppose. – ShinTakezou Jun 02 '10 at 14:29
  • `write()` takes `const void *buf` as its second argument – guest Jun 02 '10 at 14:32
  • just figured, you are right of course! should be `char p` – guest Jun 02 '10 at 14:39
  • thx! i was just wondering how that would work with `push`.. what is the difference in using `push` and `pushb`, tho? and do i have to pop them, when i'm done? – guest Jun 02 '10 at 14:55
  • @cthom ... it would be a good idea, but pushb does not exist, and it would de-align the stack, which is better dword-aligned (or at least word-aligned); in fact, instead of my pushl, the asker can use `pushw` and end the code with `add $2, %esp`. – ShinTakezou Jun 02 '10 at 15:04
  • @Shin yah, i just noticed that. I don't think unaligning the stack would matter much since you realign right after, fixing it now though. – cthom06 Jun 02 '10 at 15:10
  • no in fact, but since pushb does not exist you'd stick to subl $1, %esp; movb $80, (%esp); .... addl $1, %esp ; ... which is nice anyway, but one instruction longer – ShinTakezou Jun 02 '10 at 15:16
0

Something like


char p = 'P';

int main()
{
__asm__ __volatile__
                    (
                     " movl $1,  %%edx \n\t"
                     " leal p , %%ecx \n\t"
                     " movl $0,  %%ebx \n\t"
                     " movl $4,  %%eax \n\t"
                     " int $0x80       \n\t"
                     ::: "%eax", "%ebx", "%ecx", "%edx"
                    );
}

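Add: note that I've used lea to load the effective address of the char into the ecx register. For the value of ebx I tried both $0 and $1, and both seem to work (fd 0 typically works only when stdin is a terminal opened read/write; $1, stdout, is the correct value).

To avoid the external char: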

int main()
{
__asm__ __volatile__
                    (
                     " movl $1,  %%edx \n\t"
                     " subl $4, %%esp \n\t"
                     " movl $80, (%%esp)\n\t"
                     " movl %%esp, %%ecx \n\t"
                     " movl $1,  %%ebx \n\t"
                     " movl $4,  %%eax \n\t"
                     " int $0x80       \n\t"
                     " addl $4, %%esp\n\t"
                     ::: "%eax", "%ebx", "%ecx", "%edx"
                    );
}

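N.B.: it works because Intel processors are little-endian: movl stores four bytes, and the low byte (80) lands at the lowest address, which is where %esp points. :D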
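
The endianness point can be checked in plain C without any assembly. This is a small sketch with hypothetical helper names (lowest_address_byte, host_is_little_endian): it stores a 32-bit value and looks at the byte at its lowest address, which is the byte a one-byte write() starting at that address would send.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the byte stored at the lowest address of a 32-bit value. */
unsigned char lowest_address_byte(uint32_t value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);  /* copy the in-memory representation */
    return bytes[0];
}

/* Hypothetical helper: on a little-endian host the low-order byte comes first. */
int host_is_little_endian(void)
{
    return lowest_address_byte(1) == 1;
}
```

On a little-endian machine such as x86, lowest_address_byte(80) is 'P'; on a big-endian machine it would be 0, and the trick with movl $80, (%esp) would write a NUL byte instead.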

ShinTakezou
  • ah. i'm not that good with assembly yet, so i've got basically no idea about lea and the like.. also, 0 & 1 are stdin and stdout resp. iirc – guest Jun 02 '10 at 14:30
  • of course you can use the stack; the gcc-generated code uses ebp, which it set up beforehand; we can stick to using just esp for the moment. I've modified the answer, see it. – ShinTakezou Jun 02 '10 at 14:41
  • awesome!! can it also be done with a `push` and the stackpointer? – guest Jun 02 '10 at 14:45
  • you can make my code better thanks to the suggestion of using push (but pushb does not exist, and anyway it would "de-align" the stack...): remove `subl $4, %esp` and, instead of `movl $80, (%esp)`, put `pushl $80`; the rest is the same. – ShinTakezou Jun 02 '10 at 14:58
  • uh seen now your comment; yes as said you can use pushl and avoid decrementing the stack by hand; esp is the stackpointer; to use the base/frame pointer ebp or how you call it, you should preserve the previous value of it in use by C for local variables;... so after all it is not so useful, in this case esp is easier – ShinTakezou Jun 02 '10 at 15:01
  • instead of pushl, you can use `pushw $80` and end with `addl $2, %esp`... question of taste, I believe... – ShinTakezou Jun 02 '10 at 15:07
  • makes sense to me. so i will stick to your approach (though first i should read up on this topic a bit).. also, you are my personal hero! thx a lot! – guest Jun 02 '10 at 15:08