2

I'm trying to translate the following program to x86 assembly ( AT&T ).

#include <stdio.h>

int main()
{
   int n = 123;
   int reverse = 0;


   while (n != 0)
   {
      reverse = reverse * 10;
      reverse = reverse + n%10;
      n       = n/10;
   }

   printf("%d\n", reverse);

   return 0;
}

It's supposed to print 321.

However, with the code below, I'm getting a 0 instead. Can anybody give me a clue about what I'm doing wrong here? ( I pasted just the relevant section below. I'm sure that initialization and printing are working fine. You can see the whole thing here)

  movl  $123, %esi    # int n
  movl  $0, %edi    # int reverse
  movl $10, %ebx    # divisor


L1:     # while n != 0

cmpl $0, %esi
je L2

# reverse = reverse * 10
imul $10, %edi

# reverse = reverse + n % 10
movl $0, %edx
movl %edi, %eax
idivl %ebx
addl %edx, %edi

# n = n / 10
movl %esi, %eax
movl $0, %edx
idivl %ebx
movl %eax, %esi

jmp L1

L2:  # end while

movl %edi, %eax

Maybe I'm not yet perfectly understanding what the idivl command is supposed to do. I understand that it divides %edx:%eax by %ebx and stores the quotient in %eax and the remainder in %edx.

Ruan
  • 772
  • 4
  • 13

2 Answers2

5
# reverse = reverse + n % 10
movl $0, %edx
movl %edi, %eax   ; <--- here

%edi is not n, according to the comments above:

movl  $123, %esi    # int n

So, it should be using %esi, i.e. movl %esi, %eax.

davmac
  • 20,150
  • 1
  • 40
  • 68
1

sometimes it is good to see what the compiler generates

int reverse(int x)
{
   int r = 0;

   while (x != 0)
   {
      r = r * 10;
      r = r + x%10;
      x = x/10;
   }
   return r;
}

and shortest version:

reverse:
  xor eax, eax
  mov esi, 10
.L2:
  test edi, edi
  je .L5
  imul ecx, eax, 10
  mov eax, edi
  cdq
  idiv esi
  mov edi, eax
  lea eax, [rdx+rcx]
  jmp .L2
.L5:
  ret

or the fastest:

reverse:
  xor eax, eax
  test edi, edi
  je .L4
  mov esi, 1717986919
.L3:
  lea ecx, [rax+rax*4]
  mov eax, edi
  imul esi
  mov eax, edi
  sar eax, 31
  sar edx, 2
  sub edx, eax
  lea eax, [rdx+rdx*4]
  add eax, eax
  sub edi, eax
  test edx, edx
  lea eax, [rdi+rcx*2]
  mov edi, edx
  jne .L3
  rep ret
.L4:
  rep ret

as you see the compilers same good/better than the 99.99% of the coders

0___________
  • 60,014
  • 4
  • 34
  • 74
  • For your "fastest" version, looks like you forgot to tell gcc to optimize for a modern CPU; it's not taking advantage of macro-fusion. Use `-mtune=haswell`, or better `-march=haswell`. You could also get faster code if you used unsigned instead of signed division. I don't think you want your remainder to be `-9..0` for negative inputs anyway. But yes, [using `idiv` for constant divisors is well known to be sub-optimal](https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi) – Peter Cordes Apr 20 '18 at 12:46
  • @PeterCordes - in this example I do not see many possible improvements. – 0___________ Apr 20 '18 at 12:58
  • @PeterCordes OP wants signed to have the signed result – 0___________ Apr 20 '18 at 13:00
  • Ok, that's fair, I was thinking the algo would be broken with negative inputs, but it does actually work. You do get `-321` from `-123` https://godbolt.org/g/u9rrru. Still, for larger numbers it would be more efficient to handle the sign outside the loop, instead of doing extra work for signed division / modulo with the fixed-point multiplicative inverse. – Peter Cordes Apr 20 '18 at 13:04
  • The improvement you'd get from `-march=haswell` is putting the `test` right before the `jne`, saving 1 total uop. Also removing the `rep` prefix from the `ret`, because we probably don't care about AMD K10. – Peter Cordes Apr 20 '18 at 13:05
  • @PeterCordes you are missing the point of my answer. I just wanted to show the OP that it is a complete waste of time writing this kind of code in the assembler. – 0___________ Apr 20 '18 at 13:06
  • Oh. I thought it was a challenge to the 0.01% of coders to find the optimizations gcc missed, sorry. :P I did notice the OP was zeroing `edx` (inefficiently), instead of `cdq`, so the OP either doesn't care about negative numbers, or else has another bug they haven't discovered yet. – Peter Cordes Apr 20 '18 at 13:10
  • I'm just getting started in asm so this was just an exercise to get used to the standard commands. I'm well aware that I can't beat the compiler. – Ruan Apr 20 '18 at 18:06