
I am testing the use of .bss to allocate a memory area that holds a single number, and then printing that number to the console. The output is not what I expected: I should get the number (12), but I get a newline instead.

System config:

$ uname -a
Linux 5.8.0-48-generic #54~20.04.1-Ubuntu SMP Sat Mar 20 13:40:25 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

description: CPU
product: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz

The code:

# compile with: gcc -ggdb -nostdlib -no-pie  test.s -o test

.bss
.lcomm          output,1

.global _start
.text

_start:
        # test .bss: store the number 12 in the memory allocated in .bss (address held in rbx)
        mov     $output, %rbx    # rbx to hold address of allocated space
        mov     $12,%rdx          # Move a number to rdx
        mov     %rdx,(%rbx)       # Move content in rdx to the address where rbx points to (e.g ->output)

        # setup for write syscall:  
        mov     $1,%rax          # system call for write, according to syscall table (http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/)
        mov     $1,%rdi          # fd = 1, stdout
        mov     $output,%rsi     # address of string to output moved to rsi
        mov     $1,%rdx          # number of bytes to be written

        syscall                  # should write 12 in console

        mov     $60,%rax
        xor     %rdi,%rdi
        syscall                 # exit normally

I set a breakpoint at the first syscall (using GDB) to inspect the registers:

i r rax rbx rdx rdi rsi

rax            0x1                 1
rbx            0x402000            4202496
rdx            0x1                 1
rdi            0x1                 1
rsi            0x402000            4202496

x/1 0x402000
0x402000 <output>:  12

The output after the syscall is blank; I expected to get the number "12":

:~/Dokumenter/ASM/dec$ gcc -ggdb -nostdlib -no-pie  test.s -o test
:~/Dokumenter/ASM/dec$ ./test

:~/Dokumenter/ASM/dec$ ./test

:~/Dokumenter/ASM/dec$ 

So my question is: is there an obvious explanation for why I am getting a blank line and not 12?

  • It wrote the byte 12, which in ASCII is a form feed. If you want to see the two characters `12`, you need to write those two characters, e.g. `1` is ASCII code 49 (`0x31`). The `write` system call won't do binary-to-decimal conversion for you. – Nate Eldredge Apr 08 '21 at 22:37
  • By the way, your `mov %rdx,(%rbx)` writes 8 bytes into a variable that only has space for 1. You are going to want to learn about operand size. However, that sequence of three instructions can be replaced by `movb $12, output`. – Nate Eldredge Apr 08 '21 at 22:38
  • Thanks, but even if I increase the size to 8 (`.lcomm output,8`) and write different byte lengths (2, 4, 8) with e.g. `mov $8,%rdx`, it still just prints a newline. I also simplified the code to test `movb` (`movb $12,output`, with `mov $2,%rdx # number of bytes to be written`), but got the same result. – Martin Haneferd Apr 08 '21 at 23:00
  • Yeah, because you are now writing out the two bytes `12, 0` which will NOT display as the numerals `1` and `2`; it's a form feed followed by a null character. You have to write out the two bytes `49, 50`. To see what I mean, do `movb $49, output` and `movb $50, output+1`. Look at `man ascii` to see what numerical values correspond to what characters. – Nate Eldredge Apr 08 '21 at 23:07
  • You can also write `movb $'1', output` and the assembler will work out the correct ASCII value for you. – Nate Eldredge Apr 08 '21 at 23:08
  • Thanks again @Nate. I finally got it, doing this with UTF-8. I had to use two bytes for each number: `movb $0x31,output`, `movb $0x32,output+1` and `mov $2,%rdx`. So the next thing is to try to create a UTF-8 converter :-) Thanks again. – Martin Haneferd Apr 08 '21 at 23:39
  • I think you may still be confused. The `$0x31` is **one** byte. It corresponds to the **character** `1`. The **number** twelve needs **two** characters to display it, because it has two decimal digits, even though its numerical value can be stored in a single byte. UTF-8 incorporates ASCII so there is no difference between them in this case. – Nate Eldredge Apr 09 '21 at 00:54
  • There's an example of a binary-to-decimal conversion routine at https://stackoverflow.com/questions/45835456/printing-an-integer-as-a-string-with-att-syntax-with-linux-system-calls-instea/45851398#45851398. – Nate Eldredge Apr 09 '21 at 00:56
  • Hi again Nate, I got the two-byte part. My reply was probably not clear about that. Thanks for the link, I'll try to implement that so I can display different numbers, not only '12' :-) Thanks again. – Martin Haneferd Apr 09 '21 at 12:20
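
As a minimal sketch of what the comments above describe, the program below stores the two ASCII characters `1` and `2` with byte-sized stores and writes them out. It assumes the same build command as in the question (`gcc -nostdlib -no-pie test.s -o test`); the three-byte buffer and the trailing newline are additions for illustration and are not part of the original code.

```asm
.bss
.lcomm  output, 3                 # room for '1', '2' and a newline

.text
.global _start
_start:
        movb    $'1', output      # ASCII 0x31, one byte
        movb    $'2', output+1    # ASCII 0x32, one byte
        movb    $0x0a, output+2   # newline (added for readability of the output)

        mov     $1, %rax          # system call number for write
        mov     $1, %rdi          # fd = 1, stdout
        mov     $output, %rsi     # address of the bytes to write
        mov     $3, %rdx          # number of bytes to write
        syscall                   # writes "12" followed by a newline

        mov     $60, %rax         # system call number for exit
        xor     %rdi, %rdi        # exit status 0
        syscall
```

Each `movb` stores exactly one byte, so nothing is written past the end of the three bytes reserved by `.lcomm`.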

1 Answer

mov     $output,%rsi     # address of string to output moved to rsi
                                      ^^^^^^

Address of *string*. The value $12 is not the character sequence "12". If you wanted to print the string 12, you would need to load 0x31 and 0x32 ('1' and '2') into the memory area (making it big enough), then use 2 as the length.

For example, `movw $0x3231, output`, or better `movw $0x3231, output(%rip)` to use RIP-relative addressing for static data, as is normal for x86-64. (Unlike NASM, GAS syntax doesn't accept `$'12'` as a way to write the same integer constant.)
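
A short sketch of the single word store, assuming a two-byte buffer and the same build command as in the question. Because x86 is little-endian, the low byte `0x31` lands at the lower address, so the bytes in memory read `'1'`, `'2'`. Loading the address straight into RSI and using 32-bit register writes for the small constants are choices made here for brevity.

```asm
.bss
.lcomm  output, 2                     # exactly two bytes: '1' and '2'

.text
.global _start
_start:
        lea     output(%rip), %rsi    # buffer address, RIP-relative
        movw    $0x3231, (%rsi)       # stores 0x31 ('1') at output, 0x32 ('2') at output+1

        mov     $1, %eax              # write (a 32-bit write zero-extends into rax)
        mov     $1, %edi              # fd = 1, stdout
        mov     $2, %edx              # two bytes
        syscall                       # writes "12" with no trailing newline

        mov     $60, %eax             # exit
        xor     %edi, %edi            # status 0
        syscall
```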

If you want to print an integer as a string, you'll probably want to manipulate it mathematically so you can do it one digit at a time. (See "Printing an integer as a string with AT&T syntax, with Linux system calls instead of printf".)
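
One way to sketch the digit-at-a-time idea is repeated division by 10, filling a buffer from the end backwards. This is an illustration under assumptions, not the routine from the linked answer: the label `buf`, the 21-byte size (20 digits for an unsigned 64-bit value plus a newline) and the hard-coded value 12 are all hypothetical choices.

```asm
.bss
.lcomm  buf, 21                     # up to 20 decimal digits plus a newline

.text
.global _start
_start:
        mov     $12, %eax           # value to print (zero-extends into rax)
        lea     buf+21(%rip), %rsi  # one past the end of the buffer
        dec     %rsi
        movb    $0x0a, (%rsi)       # trailing newline
        mov     $10, %ecx           # divisor
.Ldigit:
        xor     %edx, %edx          # clear the high half of the dividend rdx:rax
        div     %rcx                # rax = quotient, rdx = remainder (0..9)
        add     $'0', %edx          # remainder -> ASCII digit
        dec     %rsi
        mov     %dl, (%rsi)         # store digits, least significant first, moving backwards
        test    %rax, %rax
        jnz     .Ldigit             # repeat until the quotient is zero

        lea     buf+21(%rip), %rdx
        sub     %rsi, %rdx          # length = end of buffer - first digit
        mov     $1, %eax            # write
        mov     $1, %edi            # fd = 1, stdout
        syscall                     # rsi already points at the first digit

        mov     $60, %eax           # exit
        xor     %edi, %edi          # status 0
        syscall
```

With the value 12 the loop runs twice, storing `'2'` and then `'1'`, so the three bytes written are `12` and a newline. The loop always runs at least once, so a value of 0 prints as `0`.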

Peter Cordes
paxdiablo
  • Thanks. I got it!! Had to use two bytes for each number: `movb $0x31,output`, `movb $0x32,output+1` and `mov $2,%rdx`. – Martin Haneferd Apr 08 '21 at 23:41
  • @MartinHaneferd: You can of course store both byte values as part of one word store; see my edit. (Remember x86 is little-endian.) Also, since you're putting the address into a register anyway, you might as well use RSI, which you eventually want for the syscall, instead of separately using RBX. `mov $2, %edx` is a more efficient way to write a small positive integer into a 64-bit register. Not everything should be 64-bit operand-size; for example, `mov %rdx,(%rbx)` is a 64-bit store. – Peter Cordes Apr 09 '21 at 01:17
  • @Martin, there's nothing in your comment addressing my "making it big enough" comment (though you may have done so without commenting). If not, you probably want to ensure the `.lcomm` allocates more than a single byte. – paxdiablo Apr 09 '21 at 03:31
  • @paxdiablo, Yes, I increased the .lcomm with two bytes. I also figured out that by slicing the number into two digits (bytes), adding 0x30 to each of them, and putting them in bytes 1 and 2, it worked. So the main answer to my problem was that to present a regular number as text/string, it has to be converted to ASCII first. So, once again, thanks for the answer. My head is back on track :-) – Martin Haneferd Apr 09 '21 at 12:17