0

I read here that there are different arithmetic assembly instructions for signed operations. However, consider this code:

unsigned char a = 0xff;
if(a == 0xff){
    // do something
}

I remember writing code like this in college, and the if condition was never true, but that was on AVR. I have seen a similar thing at work, so I typed it into Godbolt, and for x86 it shows:

mov     BYTE PTR [rbp-1], -1
cmp     BYTE PTR [rbp-1], -1
jne     .L2

which is disturbing, because it suggests that the unsigned modifier is ignored. Is this regulated by the standard, and did gcc just simplify the code for optimization purposes and actually convert 0xff to unsigned?

EDIT: There was some confusion about my original problem (that's on me; I didn't mention that my example was meant for an 8-bit processor), so here's an alternative:

int main(){
    unsigned int a = 0xffffffff;
    if(a == -1)
        return 0;
    return 1;
}

translates to:

main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], -1
        cmp     DWORD PTR [rbp-4], -1
        jne     .L2
        mov     eax, 0
        jmp     .L3
.L2:
        mov     eax, 1
.L3:
        pop     rbp
        ret

which (if I understand correctly) means that 2^32-1 actually equals -1.

jbulatek
  • 154
  • 10
  • 5
    The unsigned modifier is not ignored. Which instructions would you expect instead of these? –  Sep 12 '20 at 23:45
  • 2
    After `unsigned char a = 0xff;`, the value of `a` is 255. The value of the literal `0xff` is also 255. Even if there is a conversion, when 255 is converted to `unsigned char` or to `int` or to `unsigned int`, the value is still 255. So the compiler would compare 255 to 255. What do you think should happen differently? – Eric Postpischil Sep 13 '20 at 00:07
  • 1
    what you did in **college** was probably using `char` which was a signed type. – Antti Haapala -- Слава Україні Sep 13 '20 at 06:19
  • I expected to see: `movzx eax, BYTE PTR [rbp-1]` / `cmp eax, -1` – jbulatek Sep 13 '20 at 12:48
  • But `0xff` isn't `(int)-1`. It's a small positive integer. So `cmp eax, -1` wouldn't implement the C logic; it's the wrong constant. But yes, a `movzx` load would explicitly implement the C integer promotion rules: a value-preserving widening of that operand of `==` to `int`, matching the type of `0xff` (which, as a numeric literal, already has type `int`, no promotion needed). Note that even on an 8-bit processor, `int` is guaranteed by ISO C to be at least 16 bits wide, and `0xff` is comfortably below `INT_MAX`. We only get a byte-compare with `-1` after optimizing away widening. – Peter Cordes Sep 13 '20 at 14:43

2 Answers

3

For most operations, x86 does not have different instructions for signed and unsigned integers. The representation of -1 in 8 bits is 0xff, so the instructions shown are exactly the same as if the constant were written as 0xff.
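
To make the "same representation" point concrete, here is a minimal C sketch. The function names are made up for illustration, and it assumes 8-bit bytes and a 2's complement target such as x86:

```c
#include <string.h>

/* Returns 1 if a signed char holding -1 and an unsigned char holding 0xff
   occupy memory as the same byte. This is the case on 2's complement
   targets such as x86 (an assumption of this sketch). */
int same_byte_pattern(void)
{
    signed char s = -1;
    unsigned char u = 0xff;
    return memcmp(&s, &u, 1) == 0;
}

/* Converting -1 to unsigned char is defined by C as reduction modulo
   UCHAR_MAX + 1, so the result is UCHAR_MAX (255 when bytes are 8 bits). */
unsigned char minus_one_as_uchar(void)
{
    return (unsigned char)-1;
}
```

On x86, `same_byte_pattern()` should return 1 and `minus_one_as_uchar()` should return 0xff, which is why the assembler is free to print the byte immediate as either -1 or 255.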

prl
  • 11,716
  • 2
  • 13
  • 31
  • 2
    P.S. It’s better to not look at unoptimized compiler output. It’s pretty much crap code. – prl Sep 12 '20 at 23:55
  • OK, so char is just an example, because in college I was working on an 8-bit processor. But what if there is a -1 (written 0xffff..) that takes up a whole memory cell? I was asking whether such a literal will be interpreted as -1 or as 2^n-1, i.e. whether my condition will ever be true. – jbulatek Sep 13 '20 at 12:54
1

I expected to see: movzx eax, BYTE PTR [rbp-1] / cmp eax, -1

But 0xff isn't (int)-1. It's a small positive integer. So cmp eax, -1 wouldn't implement the C logic; it's the wrong constant.

Yes, a movzx load would explicitly implement the C integer promotion rules: a value-preserving widening of that operand of == to int (part of the "usual arithmetic conversions"). This gives you an int matching the type of 0xff (which, as a numeric literal, already has type int, so it needs no promotion in the C abstract machine). So the movzx part is a valid part of naively implementing the C abstract machine rules.

See Implicit type promotion rules and https://en.cppreference.com/w/c/language/conversion. Note that in the C abstract machine, basically everything you can do with a narrow type promotes it to int before the operation, even something like unary - negation. But compilers can usually optimize back down to the actual operand-size of the original variable for an operation that ultimately has the same result.

Note that even on an 8-bit processor, int is guaranteed by ISO C to be at least 16 bits wide, and 0xff is comfortably below INT_MAX. We only get a byte-compare with -1 after optimizing away widening.
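
A minimal sketch of those promotion rules, which also fits the comment suggesting that the college code used a signed char type. The function names are hypothetical, and the sketch assumes 8-bit bytes. Converting 0xff to signed char is implementation-defined; gcc wraps it to -1, but either way the result cannot be 255:

```c
/* unsigned char promotes to int with its value preserved: 0xff becomes 255,
   and the literal 0xff is already the int 255, so the comparison is true. */
int unsigned_char_matches_0xff(void)
{
    unsigned char a = 0xff;
    return a == 0xff;
}

/* A signed 8-bit char cannot hold 255. The out-of-range conversion is
   implementation-defined (gcc yields -1), and the promoted value is
   compared against the int 255, so the comparison is false. */
int signed_char_matches_0xff(void)
{
    signed char a = (signed char)0xff;
    return a == 0xff;
}
```

Here `unsigned_char_matches_0xff()` returns 1 and `signed_char_matches_0xff()` returns 0, regardless of whether the compiler emits a widening load or a plain byte compare.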


which (if I understand correctly) means that 2^32-1 actually equals -1.

Yes, x86, like all modern ISAs, uses 2's complement integers. ISO C through C17 allows 2's complement, 1's complement, or sign/magnitude.

(Fun fact: C++20 finally dropped the others, specifying only 2's complement. Signed overflow is still undefined behaviour, but that's a separate issue... Modern optimizing C and C++ implementations are very far from portable assembly language.)
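
At the C level, the comparison from the edited question is true for a reason that does not depend on the bit pattern at all: when an unsigned int is compared with an int, the int operand is converted to unsigned int, and that conversion is defined as arithmetic modulo UINT_MAX + 1. A minimal sketch, with made-up function names:

```c
#include <limits.h>

/* The int operand -1 is converted to unsigned int before the comparison:
   -1 + (UINT_MAX + 1) == UINT_MAX. So this is true exactly when a holds
   UINT_MAX (0xffffffff if unsigned int is 32 bits wide). */
int unsigned_equals_minus_one(unsigned int a)
{
    return a == -1;
}

/* The same conversion written out explicitly. */
int minus_one_converts_to_uint_max(void)
{
    return (unsigned int)-1 == UINT_MAX;
}
```

So `unsigned_equals_minus_one(UINT_MAX)` returns 1 and `unsigned_equals_minus_one(0)` returns 0. On a 2's complement machine the conversion costs nothing, because the bits already match, which is why the compiler can emit a single `cmp` with the immediate -1.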

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847