You are right, the first three steps are doing x / 100 -- more or less by multiplying by 1/100 -- but the division is not complete until after the sub %eax, %edi.
So, to answer your question about the steps which follow the first three, here is the code fragment, annotated:
mov $0x51eb851f,%edx # magic multiplier for signed divide by 100
mov %ecx,%eax # %ecx = x
imul %edx # %edx:%eax = x * magic -- first step of signed x / 100
sar $0x5,%edx # q = %edx:%eax / 2^(32+5)
mov %edx,%edi # %edi = q (so far)
mov %ecx,%eax # %eax = x (again)
sar $0x1f,%eax # %eax = (x < 0) ? -1 : 0
sub %eax,%edi # %edi = x / 100 -- finally
imul $0x64,%edi,%eax # %eax = q * 100
sub %eax,%ecx # %ecx = x - ((x / 100) * 100), i.e. x % 100
Noting:
Typically, in this technique for divide-by-multiplying, the multiply produces a result which is scaled up by 2^(32+n) (for 32-bit division). In this case, n = 5. The full result of the multiply is %edx:%eax, and discarding %eax divides by 2^32. The sar $0x5, %edx divides by 2^n -- since this is a signed division, it requires an arithmetic shift.
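The scaling described above can be modelled in C. This is only a sketch: the helper name mul_shift_100 is made up for illustration, and it assumes that >> on a negative signed value is an arithmetic shift (true for gcc and clang, but implementation-defined in the C standard).

```c
#include <stdint.h>

/* Hypothetical helper: models "imul %edx" followed by "sar $0x5,%edx".
 * Assumes >> on negative signed values is an arithmetic shift. */
static int32_t mul_shift_100(int32_t x)
{
    const int64_t magic = 0x51eb851f;         /* 2^37 / 100, rounded up */
    int64_t product = (int64_t)x * magic;     /* the 64-bit %edx:%eax */
    int32_t high = (int32_t)(product >> 32);  /* keep %edx, discard %eax: / 2^32 */
    return high >> 5;                         /* sar $0x5,%edx: / 2^5 */
}
```

For non-negative x this is already x / 100. For negative x it comes out one too low, for example mul_shift_100(-1234) is -13, not -12.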
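Sadly, for signed division the shifted %edx is not quite the quotient. Because the magic multiplier is slightly larger than 2^37 / 100 and the arithmetic shift rounds towards minus infinity, a negative dividend (given that the divisor is positive) gives a result which is exactly one less than the quotient rounded towards zero, so we need to add 1 to get the quotient. So sar $0x1f, %eax gives -1 if x < 0, and 0 otherwise. And the sub %eax, %edi completes the division.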
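Putting the whole fragment together, a C model of it might look like the following. The names div100 and mod100 are hypothetical, and the sketch again assumes that >> on a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical model of the annotated fragment: signed x / 100.
 * Assumes >> on negative signed values is an arithmetic shift. */
static int32_t div100(int32_t x)
{
    int64_t product = (int64_t)x * 0x51eb851f;  /* imul %edx -> %edx:%eax */
    int32_t q = (int32_t)(product >> 32) >> 5;  /* sar $0x5,%edx */
    int32_t sign = x >> 31;                     /* sar $0x1f,%eax: -1 or 0 */
    return q - sign;                            /* sub %eax,%edi */
}

/* Hypothetical model of the last two instructions: x - (x / 100) * 100. */
static int32_t mod100(int32_t x)
{
    return x - div100(x) * 100;  /* imul $0x64,%edi,%eax ; sub %eax,%ecx */
}
```

With the correction in place the results should agree with C's own / and % operators, which also round towards zero.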
This step could equally be achieved by shr $0x1f, %eax and add %eax, %edi. Or add %eax, %eax and adc $0, %edi. Or cmp $0x80000000, %ecx and sbb $-1, %edi -- which is my favourite, but sadly saving the mov %ecx, %eax saves nothing these days, and in any case cmp $0x80000000, %ecx is a long instruction :-(
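To see that these variants all apply the same correction, here is a rough C model of each one. The function names are invented for illustration, the carry flag is modelled by hand, and the first variant assumes that >> on a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/* Each hypothetical function takes the uncorrected q and the dividend x. */

/* sar $0x1f,%eax ; sub %eax,%edi */
static int32_t fix_sar_sub(int32_t q, int32_t x)
{
    return q - (x >> 31);                     /* subtract -1 or 0 */
}

/* shr $0x1f,%eax ; add %eax,%edi */
static int32_t fix_shr_add(int32_t q, int32_t x)
{
    return q + (int32_t)((uint32_t)x >> 31);  /* add 1 or 0 */
}

/* add %eax,%eax ; adc $0,%edi */
static int32_t fix_add_adc(int32_t q, int32_t x)
{
    uint32_t u = (uint32_t)x;
    uint32_t cf = (uint32_t)(((uint64_t)u + u) >> 32);  /* carry out = sign bit */
    return q + (int32_t)cf;
}

/* cmp $0x80000000,%ecx ; sbb $-1,%edi */
static int32_t fix_cmp_sbb(int32_t q, int32_t x)
{
    int32_t cf = (uint32_t)x < 0x80000000u;   /* borrow is set when x >= 0 */
    return q + (1 - cf);                      /* q - (-1) - CF */
}
```

All four return q + 1 when x is negative and q otherwise.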
It's not clear why this shuffles the quotient q to %edi -- if it was left in %edx it would still be there after imul $0x64,%edx,%eax.