
I want to use inline assembly in a C program without using constraints, since I don't want to have to explain them in my already complex school paper. I'm attempting to XOR some characters with a key. Here's the code:

#include <stdio.h>
char text[11] = {'H','e','l','l','o'};
int key = 42;
int e;
char z;
int main()
{
    for (e=0; e<12; e++){
    z = text[e];
    asm volatile(
        "movl $z, %ax;" \
        "movl $key, %bx;" \
        "xor $ax, %bx;" \
        "movl $ax, %[z];");
    printf("%c\n", z);
    }
return 0;
}

but I keep getting:

main.c:11:13: error: undefined named operand ‘z’
"movl $key, %ax;" \

I tried adding static in front of the variables, since it could be a compiler/linker optimization thing. I also tried adding __attribute__((used)), but none of it helped. I am using GNU GCC v7.1.1. Any help is appreciated, thanks.

Klaus Maria
  • Is this the actual code you compile and get the error message for? Also, is this 32 or 64 bit code? – fuz Jan 10 '21 at 18:32
  • To fix the immediate problem, you have to use `z` and `key` instead of `%[z]`, `$z`, and `$key` to access the variables. Note that the code is still incorrect in that you clobber registers without informing the compiler of this. – fuz Jan 10 '21 at 18:33
  • @fuz I guess it doesn't really matter if it's for 32 or 64 bit, as long as the characters are xor-ed correctly. May you please elaborate on the clobbering registers part? Thanks – Klaus Maria Jan 10 '21 at 18:44
  • It does matter in that the required addressing mode is different (for 64 bit mode, you need `z(%rip)` instead of `z`). The compiler assumes that after an `asm` statement, all registers retain their prior values. You violate this assumption by overwriting `ax` and `bx`, causing behaviour to be undefined. It may appear to work at the moment, but may fail in mysterious ways later on. If you like I can write an answer outlining how to do it correctly. – fuz Jan 10 '21 at 18:57
  • It's impossible to safely use GNU C Basic asm (no constraints) inside a function, if you modify any registers, or any memory variables that the rest of your code reads or writes. For x86-64, you also can't use the stack to push/pop because that would clobber the red-zone. You might come up with code that might happen to work (in a debug build), but it's impossible to make it actually safe. Writing massively buggy code and not explaining it seems like a very bad idea. – Peter Cordes Jan 10 '21 at 19:07
  • (Although at some point GCC decided to make non-empty Basic asm statements have an implicit `"memory"` clobber, but this is not documented and is basically only to make old and/or buggy code happen-to-work more often. Do not depend on it in new code. https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended) – Peter Cordes Jan 10 '21 at 19:08
  • Also, it's unlikely that your var addresses fit in 16 bits, so you'd get relocation errors when you try to link. Use pointer-sized registers like `%edx` or `%rdx`. – Peter Cordes Jan 10 '21 at 19:10
  • @fuz that is supposed to be the full program, it's an example for my school paper. I guess I can keep it 32-bit to keep it simple. Can I solve the register problem by first pushing them to the stack, later popping again? Thanks so much by the way. If you want to answer please do so. Just read I can't push/pop :( – Klaus Maria Jan 10 '21 at 19:42
  • Can I just use other registers such as edx and ecx instead? – Klaus Maria Jan 10 '21 at 19:50
  • @KlausMaria The correct solution is to use *extended inline assembly* and declare a *clobber list* including the registers you use. This tells the compiler that certain registers are being overwritten. It also allows you to fix some unportable assumptions your code makes (e.g. with respect to symbol decoration). Let me write an answer real quick. – fuz Jan 10 '21 at 20:07
  • @PeterCordes IIRC basic asm comes with an implicit `"memory"` clobber. And some common uses are safe, e.g. injecting comments, break points, or long nops for tagging. – fuz Jan 10 '21 at 20:08
  • @fuz thanks for the answer. I was trying not to use constraints/extended asm for my project, but that seems to be such a horrible idea that I'll take the extra explaining in my paper. Thanks again – Klaus Maria Jan 10 '21 at 21:07
  • Just curious, why would you not just write this in C? There's a perfectly good `^` operator to do xor. The assembly has many bugs (not just the lack of constraints, e.g. using 16-bit registers to operate on 8-bit quantities) and will unavoidably be harder to explain. – Nate Eldredge Jan 10 '21 at 21:40
  • Why do you want to use assembly at all? Your reader will understand the C xor operator without explaining anything. The assembly isn’t gaining you anything other than bugs and unnecessary complexity. – prl Jan 11 '21 at 02:47
  • *it's an example for my school paper* - **Please do *not* write an example of something that nobody should ever do. GNU C inline asm is hard enough without people getting misled by unsafe examples.** Never encourage the use of GNU C Basic asm anywhere except at global scope, or as the entire body of a `__attribute__((naked))` function. Both of those are not fundamentally different from writing a whole function in asm in a separate .S file (which you can call from normal C), which is what you should actually do if you want to avoid dealing with constraints. – Peter Cordes Jan 11 '21 at 03:30
  • @NateEldredge I'm writing a paper on assembly and its use cases, and so one of the examples is inline assembly. I was just trying to get around having to use constraints. Is there a C compiler (I don't necessarily NEED GNU GCC) which gets close to what I am attempting? Thanks btw. – Klaus Maria Jan 11 '21 at 07:21
  • If you’re trying to show a use case for inline assembly, then you should show a use case that actually makes sense as a reasonable use, and also that is implemented properly. – prl Jan 11 '21 at 09:43
  • @Peter, I use basic asm for a few things that I think are safe, such as `asm("pause")`. Also cli, sti, hlt, wbinvd. – prl Jan 11 '21 at 10:06
  • @prl: Almost always, you want those instructions to be ordered wrt. memory accesses in the surrounding C. You don't get that from Basic Asm except as an undocumented extension where it implicitly has a `"memory"` clobber. Better to make it explicit, like `asm("pause" :)` to explicitly *not* use a memory clobber (by making it Extended asm), or `asm("sti" :::"memory");` where you do want memory in sync with the abstract machine before re-enabling interrupts. Or for pause, use `_mm_pause();`. Or you could use a dummy memory input operand to order `pause` wrt. one store or RMW. – Peter Cordes Jan 11 '21 at 10:11
  • @prl What use cases would make sense for inline assembly? Thanks btw. – Klaus Maria Jan 11 '21 at 11:16
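
For comparison with the comments above that suggest staying in plain C, here is a minimal sketch of the same XOR using the `^` operator. The function name `xor_buffer` and its parameters are made up for this illustration and do not come from the question.

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): XOR every byte of buf
   with the low 8 bits of key, using only the C ^ operator. */
static void xor_buffer(char *buf, size_t len, int key)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] = (char)(buf[i] ^ (char)key);
    }
}
```

Calling `xor_buffer(text, sizeof text, key)` twice with the same key restores the original bytes, since XOR is its own inverse.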

1 Answer


If possible, avoid using inline assembly at all.

If inline assembly has to be used, use extended inline assembly if at all possible and declare correct operands and clobber lists. In your particular case, something like this would work:

#include <stdio.h>
char text[11] = {'H','e','l','l','o'};
int key = 42;
int e;
char z;
int main()
{
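    for (e=0; e<11; e++){   /* text has 11 elements; e<12 read one past the end */
        z = text[e];
        asm (
            "movzbl %0, %%eax;"   /* zero-extend the byte z into eax */
            "movl %1, %%ebx;"     /* load the 32-bit key into ebx */
            "xor %%ebx, %%eax;"   /* eax ^= ebx */
            "movb %%al, %0;"      /* store the low byte back into z */
            : "+m"(z) : "m"(key) : "eax", "ebx");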
        printf("%c\n", z);
    }

    return 0;
}

Here, "+m"(z) is a read-write operand (both input and output), whereas "m"(key) is an input operand. "eax", "ebx" is the clobber list, which tells the compiler which registers your code overwrites. As all side effects of the asm statement are made explicit through its operands and clobbers, the volatile keyword can be omitted to give greater flexibility to the compiler.

Note that unless key, z, and e being global variables is a requirement, it is a good idea to make them into automatic variables and to use an additional r or i constraint for the input and output operands to give the compiler more flexibility in choosing where to keep the variables.

Also pay attention to the corrected operand sizes: key is a 32-bit variable and z is an 8-bit variable. Lastly, I've fixed your xor instruction; you had accidentally switched the operands.

Note that the assembly code can be done better. A slightly more efficient way to render the code would be like this:

#include <stdio.h>
char text[11] = {'H','e','l','l','o'};
int key = 42;
int e;
char z;
int main()
{
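    for (e=0; e<11; e++){   /* text has 11 elements; e<12 read one past the end */
        z = text[e];
        asm (
            "xorb %b1, %0"        /* z ^= low byte of key */
            /* "q" instead of "r": in 32-bit mode only a/b/c/d have a
               byte-sized name, which %b1 requires */
            : "+m"(z) : "q"(key));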
        printf("%c\n", z);
    }

    return 0;
}

This delegates loading key from memory to the C compiler, allowing it to make more informed choices on where to keep the variable. It might also be a good idea to make key and z into automatic variables, but I am unsure if this is allowed under some unnamed constraints you might have.
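
As a rough sketch of the automatic-variable variant mentioned above, the function below keeps both values in registers instead of in globals. The name `xor_byte` is invented here. The `q` constraint is a substitution for `r`, on the assumption that the code may be built for 32-bit x86, where not every general-purpose register has a byte-sized name. A plain C fallback is included for other targets.

```c
/* Hypothetical helper (not from the question): returns c XOR the low
   byte of key. Both values are ordinary function-local variables. */
static char xor_byte(char c, int key)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+q"(c): c is read and written, in a register with a byte-sized
       name (a, b, c or d in 32-bit mode; any integer register in
       64-bit mode).
       "q"(key): same register class, so %b1 can name its low byte.
       No clobber list is needed: only the operands are modified, and
       the x86 flags are treated as clobbered implicitly. */
    __asm__("xorb %b1, %0" : "+q"(c) : "q"(key));
#else
    /* Assumed fallback for other compilers or architectures. */
    c = (char)(c ^ (char)key);
#endif
    return c;
}
```

In the loop from the answer this would be used as `z = xor_byte(text[e], key);`, leaving it to the compiler to decide where `z` and `key` live.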

fuz
  • The constraint for `key` can be `"ir"` to allow an immediate. Also, the asm statement doesn't have to be `volatile` because the only effect you want from it is modification of one of the explicit operands, `"+m"(z)`. And the variables can all be local; I think the OP only made them global so they'd have asm symbol names. – Peter Cordes Jan 11 '21 at 03:33
  • Also, I'd take out the "if possible" from your suggestion to use Extended Asm. Seriously do not present Basic Asm as a valid option for this, because it really isn't without jumping through extraordinary hoops (like `add $-128, %rsp` to skip past the red zone, then save all the regs you want to use, defeating the efficiency of making it inline instead of a separate function). – Peter Cordes Jan 11 '21 at 03:37
  • And even more fundamentally, reading or modifying global vars (the only thing you *could* do) is not safe unless you're using a version of GCC (not clang AFAIK) that happens to have the patch from a few years ago to treat non-empty Basic `asm("...")` statements as if they had a `"memory"` clobber. On older GCC (and thus GCC in general) there's literally no safe way to touch any C state from Basic asm inside a function. – Peter Cordes Jan 11 '21 at 03:39
  • @PeterCordes I don't see how “If inline assembly has to be used, use extended inline assembly if possible and declare correct operands and clobber lists.” can be read to endorse basic inline assembly as a viable alternative. I'll do some copy editing for some of the rest. – fuz Jan 12 '21 at 02:44
  • It leaves open the possibility of someone convincing themselves that Extended asm wasn't "possible" for them for some reason, and then going ahead with Basic asm. Your wording implies there is a valid fallback option. e.g. like [Inline assembly statements in C code and extended ASM for ARM Cortex architectures](https://stackoverflow.com/q/65460597) where someone has an existing codebase of no-constraint statements like `asm("dmb")`, maybe for ARMcc, and wants to have them work in GNU C. My answer there points out that this only works as a nasty hack dependent on an implicit memory clobber – Peter Cordes Jan 12 '21 at 03:05
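
Following the comments about no-operand statements such as `asm("pause")`, here is a small sketch of writing them as Extended asm with an explicit `"memory"` clobber, so that ordering against the surrounding C does not depend on the undocumented implicit clobber. `cpu_relax` and `wait_for_flag` are names chosen for this illustration. The non-x86 branch is an assumption and provides only a compiler barrier.

```c
/* Hypothetical spin-wait hint with an explicit "memory" clobber. */
static inline void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* The three colons make this Extended asm: no outputs, no inputs,
       and "memory" in the clobber list. */
    __asm__ volatile("pause" ::: "memory");
#elif defined(__GNUC__)
    /* Assumed fallback: an empty statement that acts as a compiler
       barrier only. */
    __asm__ volatile("" ::: "memory");
#endif
}

/* Hypothetical usage: spin until something else (another thread, an
   interrupt handler) sets *flag to a non-zero value. */
static void wait_for_flag(const volatile int *flag)
{
    while (!*flag) {
        cpu_relax();
    }
}
```

When no ordering is wanted, the comment above suggests `asm("pause" :)` instead, which is also Extended asm but deliberately omits the clobber.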