x86 disassemble, involves modular hashing

Question

I have commented what I think the code is doing.

I have tried putting aaaaaaaaa, AAAAAAAAA, !!!!!!!!!, and 000000000, they all work. But it won't accept bbbbbbbbb or 111111111. It seems the password accepts that all ascii character corresponds to 0x21, 0x31, 0x41, ...

Initially, it calls <getchar@plt>, and it goes into a for loop, which accepts 10 characters.

After that, it starts a new loop, which I do not understand. Can you explain this loop? Is this doing modular division? Thanks in advance!

 8048443:   88 44 1d e5             mov    %al,-0x1b(%ebp,%ebx,1)
 8048447:   83 45 f4 01             addl   $0x1,-0xc(%ebp)
 804844b:   83 7d f4 09             cmpl   $0x9,-0xc(%ebp)                  # x ($ebp - 0xc) counter 9
 804844f:   7e ea                   jle    804843b <puts@plt+0xe7>
 8048451:   8b 45 f4                mov    -0xc(%ebp),%eax
 8048454:   c6 44 05 e5 00          movb   $0x0,-0x1b(%ebp,%eax,1)
 8048459:   c7 45 f4 01 00 00 00    movl   $0x1,-0xc(%ebp)                  # counter = 1
 8048460:   eb 15                   jmp    8048477 <puts@plt+0x123>         # start of the loop
 8048462:   8b 45 f4                mov    -0xc(%ebp),%eax
 8048465:   83 e8 01                sub    $0x1,%eax
 8048468:   0f b6 44 05 e5          movzbl -0x1b(%ebp,%eax,1),%eax
 804846d:   0f be c0                movsbl %al,%eax
 8048470:   01 45 f0                add    %eax,-0x10(%ebp)
 8048473:   83 45 f4 01             addl   $0x1,-0xc(%ebp)
 8048477:   83 7d f4 0a             cmpl   $0xa,-0xc(%ebp)                  # 10
 804847b:   7e e5                   jle    8048462 <puts@plt+0x10e>         # end loop
 804847d:   8b 45 f0                mov    -0x10(%ebp),%eax
 8048480:   89 c2                   mov    %eax,%edx
 8048482:   c1 fa 1f                sar    $0x1f,%edx
 8048485:   c1 ea 1c                shr    $0x1c,%edx
 8048488:   01 d0                   add    %edx,%eax
 804848a:   83 e0 0f                and    $0xf,%eax                        # only look at the lowest four bits
 804848d:   29 d0                   sub    %edx,%eax
 804848f:   89 45 f0                mov    %eax,-0x10(%ebp)
 8048492:   83 7d f0 03             cmpl   $0x3,-0x10(%ebp)                 # compare to 3
 8048496:   75 16                   jne    80484ae <puts@plt+0x15a>         # go to wrong answer
 8048498:   b8 b4 85 04 08          mov    $0x80485b4,%eax
 804849d:   8d 55 e5                lea    -0x1b(%ebp),%edx
 80484a0:   89 54 24 04             mov    %edx,0x4(%esp)
 80484a4:   89 04 24                mov    %eax,(%esp)
 80484a7:   e8 98 fe ff ff          call   8048344 <printf@plt>             # correct answer
 80484ac:   eb 0c                   jmp    80484ba <puts@plt+0x166>
 80484ae:   c7 04 24 e2 85 04 08    movl   $0x80485e2,(%esp)
 80484b5:   e8 9a fe ff ff          call   8048354 <puts@plt>               # wrong answer```

FYI, commenting out and commenting are different. The former means turning code into a comment block so it doesn't execute, and the latter means adding comments to augment the code. You "commented" the code. — Erik Eidt, Apr 09 '20 at 16:56
You left out the top of the first loop; `jle 804843b` jumps backwards to a few bytes before the `mov %al,-0x1b(%ebp,%ebx,1)`. I guess that's the getchar loop you're talking about? Ugh, this is un-optimized code that stores everything in memory all the time. — Peter Cordes, Apr 09 '20 at 19:33
The sar/shr part is doing `n %= 16;` with a signed integer type, which has to produce a negative remainder for negative `n`. This is why you should use `unsigned` or write `n &= 15;` when you *don't* want that, and want it to compile efficiently. It adds/subtracts 15 for negative `n` before/after masking. Also, this shows your code was probably compiled by gcc4.7 or older; newer gcc are smart enough to use `cltd` to sign-extend EAX into EDX:EAX instead of mov/sar 31. https://godbolt.org/z/tRLEkq shows the same code-gen from `return `n%16`. — Peter Cordes, Apr 09 '20 at 19:47

Peter Cordes · Answer 1 · 2020-04-09T20:53:14.380

I was about to post an answer about the sar/shr/add/and/sub part of this code on Reverse engineering code that does SAR by 31, shr, then add and subtracts that around an AND but it was deleted about a minute before I was done. That part of the code probably pops up often enough to be worth answering:

The code around AND is doing n %= 16; for a signed 32-bit integer type (e.g. int), implementing C signed remainder semantics. For non-negative integers, it's equivalent to n &= 0xf; i.e. keeping only the low 4 bits.

You can get GCC4.7 to emit exactly this code (Godbolt compiler explorer) for this function:

int mod16(int n) {
    return n % 16;
}

# gcc4.7.4 -O1 -m32 -mregparm=1  (first arg in EAX instead of on the stack)
mod16(int):
    movl    %eax, %edx
    sarl    $31, %edx
    shrl    $28, %edx
    addl    %edx, %eax
    andl    $15, %eax
    subl    %edx, %eax
    ret                     # return with n%16 in EAX.

Fun fact: gcc4.8 and newer use cdq instead of mov/sar $31 to sign-extend EAX into EDX:EAX. i.e. to set all bits of EDX = sign bit of EAX.

The store/reloads in your code, and using EBP as a frame pointer, indicate it was compiled without (extra) optimization. GCC more than most compilers will still often use it's efficient ways to do things within a single expression. I used -O1 on Godbolt to avoid making the compiler store/reload to memory.

In C, % has to produce a negative remainder for negative n, but it's still much more efficient to emulate that around an AND instruction to implement signed-integer division semantics than to use idiv for powers of 2. (Same deal for non power of 2 compile-time constants with a multiplicative inverse.)

Recall that (n/16)*16 + n%16 = n for all n, and that C signed division truncates towards 0. (This applies for any divisor, not just positive powers of 2). So for example -22/16 = -1 and -22 % 16 = -6.
-1 * 16 + -6 = -22. A compiler has to emit code that works for every possible n; it (usually) doesn't know that your int is never negative when your code runs.

Arithmetic right shift rounds towards -Inf so it needs similar fixups to implement / division semantics.

This is why you should use unsigned or n &= 15; when you don't need negative remainders, and want it to compile efficiently instead of to this over-complicated sequence. This extra cruft around and $0xf, %eax can optimize away if the compiler can prove that the value is non-negative, but that can't always happen (and never in a debug build).

This is part of the code from someone else's question a couple hours ago, so we can supply the missing first instruction that loads EAX from -0x10(%ebp)

 804847d:   8b 45 f0                mov    -0x10(%ebp),%eax    # EAX = n;  you left out this insn

 8048480:   89 c2                   mov    %eax,%edx
 8048482:   c1 fa 1f                sar    $0x1f,%edx         # edx = 0 or -1  (sign bit of n)

 8048485:   c1 ea 1c                shr    $0x1c,%edx         # edx = 0 or 15

Logical right shift by 28 always shifts in zeros, instead of copies of the sign bit, so that turns -1 (0xffffffff) into 15. (32-28 = 4 bits left at the bottom of the register). And of course leaves 0 as 0.

So EDX = 0 or 15, for for non-negative or negative n.

 8048488:   01 d0                   add    %edx,%eax
 804848a:   83 e0 0f                and    $0xf,%eax                        
# only look at the lowest four bits
 804848d:   29 d0                   sub    %edx,%eax

It adds/subtracts 15 (or 0) around the AND.

This is an optimized way to do -((-eax) & 15) for negative inputs, based on the 2's complement identity that -x = ~x + 1 combined with &.

Note that x & 0xf results in a number in the [0..15] range, so that minus 15 will always be negative or zero. If the input was already a multiple of 16 (even if negative because that's how 2's complement works), the low bits will be 0 to start with. So (n + 15) & 15 = 15, and 15-15 = 0, the correct result for -128 % 16 for example.

For originally-negative inputs, this is just adding/subtracting 0, the additive identity which leaves the value unchanged. So the whole thing is equivalent to n & 0xf for non-negative n.

The whole thing is equivalent to (eax < 0) ? -((-eax) & 15) : (eax & 15)

GCC using it for n % 16 also shows it gives the right result for every possible int. (There are no n values that have C undefined behaviour for n % 16).

 804848f:   89 45 f0                mov    %eax,-0x10(%ebp)    # store back into where we loaded from

So we know it's n %= 16; because it's storing the result back on top of the stack slot it loaded from originally.

 8048492:   83 7d f0 03             cmpl   $0x3,-0x10(%ebp)                 
# compare to 3

Yes, that's right. FLAGS are set according to n - 3 because AT&T syntax is backwards. But ZF is set if they're equal.

x86 disassemble, involves modular hashing

1 Answers1