As the question states, I want to print numbers that have multiple digits in AArch64 assembly.

Usually when solving problems I write them in Python first to see if they work, especially since I am very new to anything assembly. My Pythonic solution was:

number = 1234

while number != 0:
   digit = 1234 % 10         # always gives me the last digit
   print(digit, end='')      # to print the digit and not the next line
   number = number - digit   # to remove the digit I just got so the last digit is 0
   number = number / 10      # to remove said last digit and shorten number by 1

The way I tried to implement this in AArch64:

printNumber:
        mov  x16, #10                  /* apparently need this for udiv/msub */
        udiv x14, x12, x16             /* x12 is the number I defined above, bc idk what registers are safe to use (e.g. when syscall 64, print, happens, 0-8 are used) */
        msub x13, x17, x16, x12        /* thanks to: https://stackoverflow.com/questions/35351470/obtaining-remainder-using-single-aarch64-instruction */
        sub  x12, x12, x13             /* x13 is what above is digit, x12 is number */
        udiv x12, x12, x16
        add  x13, x13, #48             /* digit to string, possible error source 1 */

        mov  x0,  #1                   /* the print part */
        mov  x1,  x13                  /* the other part where I suspect a mistake */
        mov  x2,  #1
        mov  w8,  #64
        svc  #0

        cmp  x12, #0                   /* the loop part */
        beq  exit                      /* generic exit method I left out */
        b    printNumber

State of the code: it compiles and runs without problems, but it prints nothing. I cannot debug it, since I program this on an actual AArch64 device and am not using an emulator (though I am sure there is a way I don't know of).

I hope it is clear what I am trying to do. I am aware of some issues, like that I should be using the stack for some of those things, but I hope I can get that to work. I also hope it is visible that I put some effort into this and have tried to look for a solution like this, but I cannot find anyone who does it this way, or any other non-C-library way of printing numbers. (Note that I don't need the number as a string; I really only want to print it.)

Maritn Ge
  • The `write` syscall expects a pointer to memory. You need to place your digit into memory. – Jester Apr 27 '21 at 14:08
  • The `sub` and second `udiv` aren't necessary. Unsigned integer division in AArch64 truncates toward zero, so you already have the new value of `number` in `x14` immediately after the first `udiv`. I think the Python equivalent would be `number = number // 10`. Also, you can move the `mov x16, #10` before the start of the loop; there's no need to redo it on every iteration. – Nate Eldredge Apr 27 '21 at 14:29
  • syscalls are system specific (Linux, windows, roll your own, etc) please tag or list in the question the target system. – old_timer Apr 27 '21 at 14:49
  • x86-64 Linux version of the same question: [How do I print an integer in Assembly Level Programming without printf from the c library?](https://stackoverflow.com/a/46301894) - local array on the stack, store digits into it starting at the tail. Make one `write` system call at the end. You always want to avoid printing 1 char at a time; a system call is *vastly* slower than any single instruction. Higher level languages do I/O buffering for you (and Python makes everything slow) so the effect is much smaller. – Peter Cordes Apr 27 '21 at 14:54
  • I assume your Python version was supposed to be `number % 10` (so you'll print `4321`) not `1234 % 10` (so you'll print `4444`). – Peter Cordes Apr 27 '21 at 14:58
  • @old_timer i don't know my system tbh, it is a samsung phone – Maritn Ge Apr 27 '21 at 15:03
  • @PeterCordes i see, that must be an oversight in my program. i think i would just push/pop all digits so first in last out turns it around – Maritn Ge Apr 27 '21 at 15:04
  • @MaritnGe Then it's Linux (Android). Consider getting an ARM device that runs plain Linux instead, such as a Raspberry Pi. You may otherwise run into trouble later on as Android has some restrictions on what sort of code is permitted to run. – fuz Apr 27 '21 at 15:07
  • @fuz i am using it within a linux emulator (termux), and am doing it bc i am using samsung dex for this and am using my phone as a laptop so i can practice it in the train and stuff – Maritn Ge Apr 27 '21 at 15:10
  • Yes, better to use a sim or a development platform if you are trying to learn basics like this. – old_timer Apr 27 '21 at 15:12
  • @MaritnGe Termux is not a Linux emulator. It's a terminal emulator similar to those you use on a desktop system. The programs you write run directly on the system without any emulation. – fuz Apr 27 '21 at 15:13
  • @MaritnGe: If you have a look at the C code in [How do I print an integer in Assembly Level Programming without printf from the c library?](https://stackoverflow.com/a/46301894), you'll see it solves the order problem by decrementing a pointer instead of incrementing, to get the digits in printing order in a buffer. Pushing on the stack and then popping is inefficient and extra work, especially on 64-bit systems where a register is 8 bytes. No need to dirty a couple cache-lines of stack space when you could use one, or to store/reload an extra time. – Peter Cordes Apr 28 '21 at 01:12
  • @PeterCordes this still does not solve my issue of not being able to print the integers/digits though – Maritn Ge Apr 29 '21 at 07:51
  • I didn't say it did, I just said push/pop is unnecessary. (Although that answer includes C you can compile into arm64 assembly, at which point you only need to reserve a buffer on the stack and pass the final pointer to a write syscall when you're done. So it's the algorithm in a pretty usable form, which is why I linked it.) – Peter Cordes Apr 29 '21 at 08:17

1 Answer

As Jester points out, the reason you can't print is that the `write` system call expects a pointer to the data to be written in `x1`, not the data itself. You need to store your character at some appropriate address in memory (`strb`) and pass that address in `x1`.
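To make the value-versus-pointer distinction concrete, here is a minimal Python sketch (not AArch64 code). `os.write` stands in for the `write` system call: like the syscall, it wants a buffer of bytes rather than a bare character code, so the digit has to be placed in a buffer first. The helper name `write_digit` and the pipe used to capture the output are made up for illustration.

```python
import os

def write_digit(fd, digit):
    """Write one decimal digit (0-9) to file descriptor fd."""
    char_code = digit + 48          # same idea as `add x13, x13, #48`
    buffer = bytes([char_code])     # put the character into memory (the role of strb)
    return os.write(fd, buffer)     # hand over the buffer, not the bare character code

# Send the digit through a pipe so the result can be inspected.
read_end, write_end = os.pipe()
written = write_digit(write_end, 4)
os.close(write_end)
output = os.read(read_end, 16)
os.close(read_end)
```

Passing the integer itself, as in `os.write(fd, 52)`, is rejected with a `TypeError`, which is roughly the Python analogue of handing the syscall a character code where it expects an address.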

One approach would be to use the stack; just subtract from the stack pointer (in multiples of 16, for alignment) to allocate some memory for yourself, and remember to put it back when you're done. Here's an example that should work; XXX marks the lines I added or changed.

printNumberEntry:                      /* XXX */
        sub  sp, sp, #16               /* XXX allocate 16 bytes of stack space */
printNumber:
        mov  x16, #10                  /* apparently need this for udiv/msub */
        udiv x14, x12, x16             /* x12 is the number I defined above, bc idk what registers are safe to use (e.g. when syscall 64, print, happens, 0-8 are used) */
        msub x13, x14, x16, x12        /* XXX fix unrelated bug */
        sub  x12, x12, x13             /* x13 is what above is digit, x12 is number */
        udiv x12, x12, x16
        add  x13, x13, #48             /* digit to string, possible error source 1 */

        strb w13, [sp]                 /* XXX Store the low byte of x13/w13 in memory at address sp */

        mov  x0,  #1                   /* the print part */
        mov  x1,  sp                   /* XXX x1 points to the byte to be written */
        mov  x2,  #1
        mov  w8,  #64
        svc  #0

        cmp  x12, #0                   /* the loop part */
        beq  exit                      /* generic exit method I left out */
        b    printNumber

exit:                                  /* XXX */
        add  sp, sp, #16               /* XXX restore stack before returning */
        ret                            /* XXX */

I also fixed an unrelated bug: in your `msub`, `x17` should be `x14`, since that's where the quotient is.
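To see why the quotient register matters, here is a small Python model of the two instructions. It is only a sketch: the functions `udiv` and `msub` are stand-ins I wrote for the instructions of the same name, and they ignore 64-bit wraparound and the divide-by-zero case.

```python
def udiv(a, b):
    # Models `udiv Xd, Xn, Xm` for non-negative inputs: quotient truncated toward zero.
    return a // b

def msub(n, m, a):
    # Models `msub Xd, Xn, Xm, Xa`, which computes Xa - Xn * Xm.
    return a - n * m

x12 = 1234                   # the number
x16 = 10                     # the divisor
x14 = udiv(x12, x16)         # quotient
x13 = msub(x14, x16, x12)    # remainder = number - quotient * 10
```

With `x14` as the first multiplicand the result is the last digit. With an unrelated register such as `x17`, the product has nothing to do with the quotient, so the "digit" is whatever that register happened to hold times ten, subtracted from the number.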

Another approach would be to reserve a byte in static memory:

        .bss
dataToWrite:
        .skip 1                        /* reserve one byte (GNU as syntax) */
        .text
printNumberEntry:                      /* XXX */
        adr  x1, dataToWrite           /* XXX keep address in x1 throughout */
printNumber:
        mov  x16, #10                  /* apparently need this for udiv/msub */
        udiv x14, x12, x16             /* x12 is the number I defined above, bc idk what registers are safe to use (e.g. when syscall 64, print, happens, 0-8 are used) */
        msub x13, x14, x16, x12        /* XXX fix unrelated bug */
        sub  x12, x12, x13             /* x13 is what above is digit, x12 is number */
        udiv x12, x12, x16
        add  x13, x13, #48             /* digit to string, possible error source 1 */

        strb w13, [x1]                 /* XXX Store the low byte of x13/w13 at dataToWrite */

        mov  x0,  #1                   /* the print part */
        /* x1 already contains the proper address */
        mov  x2,  #1
        mov  w8,  #64
        svc  #0

        cmp  x12, #0                   /* the loop part */
        beq  exit                      /* generic exit method I left out */
        b    printNumber

exit:                                  /* XXX */
        ret                            /* XXX */
        /* your code */

The downside is that this makes your function unsafe to use from signal handlers, multiple threads, etc., as they would all try to use the same byte.

Other notes:

  • The digits print in reverse order, of course. That's something I guess you will work on later.

  • The `sub` and second `udiv` are unnecessary, because the first `udiv` already produces the rounded-down quotient (like Python's `x14 = x12 // 10`). So you could replace those two instructions with just `mov x12, x14`.

  • Several of your registers hold constant values throughout the whole function (`x16`, `x1`, `x2`, `x8`), so you could initialize them outside the loop instead of redundantly redoing it on every iteration. (Note `x0` is overwritten with the system call's return value, so you do need to reinitialize it each time.)

  • You really want to find a way to be able to use a debugger. Otherwise, get ready for a lot more instances where you spend half an hour writing a Stack Overflow question and waiting many more hours for an answer, for a bug that you could have found yourself in three minutes with a debugger. If you can't install gdb on your Android device, then consider setting up another Linux AArch64 box (say a Raspberry Pi 4, or a cloud server, or even a QEMU emulator) where you can.
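For the reverse-order problem, one possible approach (the one suggested in the comments, not something the code above does) is to fill a small buffer starting at its tail and then issue a single write. Below is a Python sketch of that algorithm, in the same spirit as prototyping in Python before writing the assembly. The names `format_unsigned` and `print_unsigned` are hypothetical, and the 20-byte buffer assumes an unsigned 64-bit input.

```python
import os

def format_unsigned(number):
    """Return the decimal digits of a non-negative integer as bytes."""
    buffer = bytearray(20)              # 2**64 - 1 has 20 decimal digits
    pos = len(buffer)                   # start one past the end of the buffer
    while True:
        number, digit = divmod(number, 10)  # quotient and remainder (udiv + msub)
        pos -= 1                        # move toward the front of the buffer
        buffer[pos] = digit + 48        # store the ASCII digit (the strb step)
        if number == 0:
            break
    return bytes(buffer[pos:])          # only the filled tail, already in print order

def print_unsigned(fd, number):
    """Emit the whole number with one write call instead of one per digit."""
    return os.write(fd, format_unsigned(number))
```

Calling `print_unsigned(1, 1234)` writes the digits to standard output in the right order. In assembly the same shape would be a stack buffer, a pointer that is decremented before each `strb`, and one `write` at the end with the final pointer in `x1` and the digit count in `x2`.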

Nate Eldredge