Trying to translate C program to reverse a number into x86 assembly

Question

I'm trying to translate the following program to x86 assembly ( AT&T ).

#include <stdio.h>

int main()
{
   int n = 123;
   int reverse = 0;


   while (n != 0)
   {
      reverse = reverse * 10;
      reverse = reverse + n%10;
      n       = n/10;
   }

   printf("%d\n", reverse);

   return 0;
}

It's supposed to print 321.

However, with the code below, I'm getting a 0 instead. Can anybody give me a clue about what I'm doing wrong here? ( I pasted just the relevant section below. I'm sure that initialization and printing are working fine. You can see the whole thing here)

  movl  $123, %esi    # int n
  movl  $0, %edi    # int reverse
  movl $10, %ebx    # divisor


L1:     # while n != 0

cmpl $0, %esi
je L2

# reverse = reverse * 10
imul $10, %edi

# reverse = reverse + n % 10
movl $0, %edx
movl %edi, %eax
idivl %ebx
addl %edx, %edi

# n = n / 10
movl %esi, %eax
movl $0, %edx
idivl %ebx
movl %eax, %esi

jmp L1

L2:  # end while

movl %edi, %eax

Maybe I'm not yet perfectly understanding what the idivl command is supposed to do. I understand that it divides %edx:%eax by %ebx and stores the quotient in %eax and the remainder in %edx.

Yes, that's how `idivl` works. Use your debugger to single step the code and see where it goes wrong. — Jester, Apr 20 '18 at 12:20
One `idiv` gives you both quotient and remainder; you don't need to run it twice with the same inputs. (Oh, I think this was actually your bug; *not* using the same inputs even though you want `n/10` and `n%10`.) — Peter Cordes, Apr 20 '18 at 12:49
Notice that you can use `xor %reg, %reg` to set a register to 0, which will be faster than doing `movl $0, %reg` — Cyclonecode, Apr 20 '18 at 13:06

davmac · Accepted Answer · 2018-04-20T12:59:34.280

5

# reverse = reverse + n % 10
movl $0, %edx
movl %edi, %eax   ; <--- here

%edi is not n, according to the comments above:

movl  $123, %esi    # int n

So, it should be using %esi, i.e. movl %esi, %eax.

edited Apr 20 '18 at 12:59

answered Apr 20 '18 at 12:26

davmac

20,150
1
40
68

That was the only thing missing. Now it works perfectly. – Ruan Apr 20 '18 at 12:28

score 1 · Answer 2 · answered Apr 20 '18 at 12:40

1

sometimes it is good to see what the compiler generates

int reverse(int x)
{
   int r = 0;

   while (x != 0)
   {
      r = r * 10;
      r = r + x%10;
      x = x/10;
   }
   return r;
}

and shortest version:

reverse:
  xor eax, eax
  mov esi, 10
.L2:
  test edi, edi
  je .L5
  imul ecx, eax, 10
  mov eax, edi
  cdq
  idiv esi
  mov edi, eax
  lea eax, [rdx+rcx]
  jmp .L2
.L5:
  ret

or the fastest:

reverse:
  xor eax, eax
  test edi, edi
  je .L4
  mov esi, 1717986919
.L3:
  lea ecx, [rax+rax*4]
  mov eax, edi
  imul esi
  mov eax, edi
  sar eax, 31
  sar edx, 2
  sub edx, eax
  lea eax, [rdx+rdx*4]
  add eax, eax
  sub edi, eax
  test edx, edx
  lea eax, [rdi+rcx*2]
  mov edi, edx
  jne .L3
  rep ret
.L4:
  rep ret

as you see the compilers same good/better than the 99.99% of the coders

answered Apr 20 '18 at 12:40

0___________

60,014
4
34
74

For your "fastest" version, looks like you forgot to tell gcc to optimize for a modern CPU; it's not taking advantage of macro-fusion. Use `-mtune=haswell`, or better `-march=haswell`. You could also get faster code if you used unsigned instead of signed division. I don't think you want your remainder to be `-9..0` for negative inputs anyway. But yes, [using `idiv` for constant divisors is well known to be sub-optimal](https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi) – Peter Cordes Apr 20 '18 at 12:46
@PeterCordes - in this example I do not see many possible improvements. – 0___________ Apr 20 '18 at 12:58
@PeterCordes OP wants signed to have the signed result – 0___________ Apr 20 '18 at 13:00
Ok, that's fair, I was thinking the algo would be broken with negative inputs, but it does actually work. You do get `-321` from `-123` https://godbolt.org/g/u9rrru. Still, for larger numbers it would be more efficient to handle the sign outside the loop, instead of doing extra work for signed division / modulo with the fixed-point multiplicative inverse. – Peter Cordes Apr 20 '18 at 13:04
The improvement you'd get from `-march=haswell` is putting the `test` right before the `jne`, saving 1 total uop. Also removing the `rep` prefix from the `ret`, because we probably don't care about AMD K10. – Peter Cordes Apr 20 '18 at 13:05
@PeterCordes you are missing the point of my answer. I just wanted to show the OP that it is a complete waste of time writing this kind of code in the assembler. – 0___________ Apr 20 '18 at 13:06
Oh. I thought it was a challenge to the 0.01% of coders to find the optimizations gcc missed, sorry. :P I did notice the OP was zeroing `edx` (inefficiently), instead of `cdq`, so the OP either doesn't care about negative numbers, or else has another bug they haven't discovered yet. – Peter Cordes Apr 20 '18 at 13:10
I'm just getting started in asm so this was just an exercise to get used to the standard commands. I'm well aware that I can't beat the compiler. – Ruan Apr 20 '18 at 18:06

Trying to translate C program to reverse a number into x86 assembly

2 Answers2