
I am new to assembly and am programming on 64-bit Linux in AT&T syntax. If I store the number 1 in a register, how can I translate that to the ASCII character "A"? For example:

movl $1, %ebx
addl $64, %ebx

Can I add 64 to 1 to make 65 (the decimal value of A), then somehow convert it to "A" and send this to the buffer using write system call?

EDIT 1: Posting my program code here.

.section .data

message:
        .long 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

length:
        .long 10

.section .text

.globl _start

_start:

xorq %rdi, %rdi  
xorq %rax, %rax  
xorq %rbx, %rbx  
xorq %rcx, %rcx                  
xorq %rdx, %rdx  
movl length, %edx

loop:

        cmpl %ecx, %edx                 
        je loop_end                     
        movl message(,%rdi,4), %eax     
        addl $64, %eax                  
        pushq %rax                      
        incq %rdi                       
        incq %rcx                       
        jmp loop                        



loop_end:

        cmpq $0, %rcx                   
        je exit                         
        popq %rbx                       
        pushq %rcx
        movq $1, %rax
        movq $1, %rdi
        movq %rbx, %rsi                 
        movl length, %edx
        syscall                         
        popq %rcx
        decq %rcx
        jmp loop_end

exit:

        movq $60, %rax
        movq $0, %rdi
        syscall
Michael Petch
Mic
  • Yes, you can even use `add $'A', %ebx` to make your code more human-readable, and document the purpose of the constant. Characters *are* integers, so yes, you just store the byte in memory and pass a pointer to that memory to `write(1, buf, 1)` – Peter Cordes Sep 15 '16 at 00:00
  • Ok, but if I don't want to use a constant, how would I translate the $65 in the %ebx register to 'A'? Are there any operations that will translate into ASCII? – Mic Sep 15 '16 at 00:12
  • Create a temporary buffer. Put a character in it (65 is a value that already represents a character). Pass the address of the buffer to `sys_write` system call. – Michael Petch Sep 15 '16 at 00:14
  • @Mic: `65` *is* `'A'`, the ASCII encoding of the English letter A. No further conversion or modification is necessary; it already is a character value. (sorry, I should have said `add $'A'-1, %ebx`, since you want EBX to index the alphabet starting from A=1. `64` is `'@'`. One advantage to using symbolic constants is not having to look up the ASCII table as often.) – Peter Cordes Sep 15 '16 at 00:27
  • Thanks for your answers guys. Maybe you can take a look at my code and point me in the right direction. I posted it in the main question. When I run it it doesn't produce anything. I guess I am expecting to see output like: a,b,c,d,e,f,g,h,i,j – Mic Sep 15 '16 at 00:48
  • If you want to write a bunch of ASCII characters, you should use single-byte stores (`movb`) to make a contiguous string, not 64-bit PUSH. Try using `strace ./a.out` to see what system calls (with what args) your program makes. – Peter Cordes Sep 15 '16 at 01:58
  • I see. good point. – Mic Sep 15 '16 at 02:03

1 Answer


I'm not entirely familiar with AT&T syntax, so the code below is written for NASM; the AT&T-style disassembly that follows it is in the syntax you're accustomed to and should suffice.

Try to avoid hard-coding constants: it makes a program harder to maintain, especially once it runs to hundreds or thousands of lines. Therefore,

            section .data       
Values:     db      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 26, 18, 12, 20, 19, 11
V_Size      equ     $ - Values

is preferable to this

message:
        .long 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

length:
        .long 10

What you did is not wrong, but it depends on you counting the elements rather than the assembler, so the length must be updated by hand whenever the data changes. As has already been pointed out, use the smallest data size that gets the job done. In this case a byte (char) is better than a long.

This code in NASM

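        section .text
        global  _start

_start: xor     ecx, ecx
        push    rcx                 ; 0: application's default return value (popped into RAX below)
        mov     cl, V_Size          ; RCX = element count, used by LOOP
        push    rcx                 ; saved as the count argument for write
        mov     ebx, Values         ; RBX = address of the array
        push    rbx                 ; saved as the buffer argument for write

Next:
        or      byte [ebx], 64      ; set bit 6: 1..26 become 'A'..'Z' (valid only for values below 64)
        inc     ebx                 ; advance to the next byte
        loop    Next                ; decrement RCX, repeat until it reaches 0

        pop     rsi                 ; RSI = buffer address
        pop     rdx                 ; RDX = number of bytes
        pop     rax                 ; RAX = 0
        inc     al                  ; RAX = 1 = SYS_write
        mov     edi, eax            ; EDI = 1 = stdout
        syscall                     ; write(1, Values, V_Size)

        mov     edi, eax            ; EDI = return value of write (bytes written)
        dec     edi                 ; exit status = bytes written - 1
        mov     eax, edi
        mov     al, 60              ; low byte = 60 = SYS_exit (assumes write did not fail)
        syscall

        section .data
Values:     db      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 26, 18, 12, 20, 19, 11
V_Size      equ     $ - Values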

will yield

ABCDEFGHIJZRLTSK

with the command prompt appearing immediately after "K", because the program writes no trailing newline.

            section .data:
 6000d8 01020304 05060708 090a1a12 0c14130b

        section .text:

<_start>:   These two instructions are idiosyncratic to my style of programming and not
            essential to functionality of program. 

  4000b0:   31 c9                   xor    %ecx,%ecx
  4000b2:   51                      push   %rcx

            Setup RCX & RBX for LOOP instruction

  4000b3:   b1 10                   mov    $0x10,%cl
  4000b5:   51                      push   %rcx                 ARG2 to syscall
  4000b6:   bb d8 00 60 00          mov    $0x6000d8,%ebx
  4000bb:   53                      push   %rbx                 ARG1 to syscall

<Next>:     This conforms to the scope of your objective.

  4000bc:   67 80 0b 40             orb    $0x40,(%ebx)         [ebx] += 'A'
  4000c0:   ff c3                   inc    %ebx
  4000c2:   e2 f8                   loop   4000bc <Next>

            ssize_t write (int fd, const void *buf, size_t count);

  4000c4:   5e                      pop    %rsi                 ARG1 = ASCII Pntr
  4000c5:   5a                      pop    %rdx                 ARG2 = # of chars
  4000c6:   58                      pop    %rax
  4000c7:   fe c0                   inc    %al                  SYS_WRITE
  4000c9:   89 c7                   mov    %eax,%edi            ARG0 = STD_OUT
  4000cb:   0f 05                   syscall

            Epilogue: Again, just a method I use.

  4000cd:   89 c7                   mov    %eax,%edi  
  4000cf:   ff cf                   dec    %edi
  4000d1:   89 f8                   mov    %edi,%eax
  4000d3:   b0 3c                   mov    $0x3c,%al
  4000d5:   0f 05                   syscall 
Shift_Left
  • Why are you using such weird uncommented NASM code to set up the registers for a system call? Your SYS_exit (eax=60) seems to depend on write returning a positive value (not an error code). So you could make your program run `syscall` with `eax=0xFFFFFF3c` by running `./a.out > /dev/full`, `./a.out >&-` (close stdout) or any other way to make sys_write fail. The kernel looks at the whole of EAX, not just AL, for the syscall number (x86-64 syscall numbers go up to `__NR_execveat 322` currently). Oh, I see you have comments in the AT&T disasm output, but they don't explain anything there. – Peter Cordes Sep 16 '16 at 00:12
  • Even if you do want your exit status dependent on write(), the sane way to write the same code is half the instructions: `lea edi, [rax-1]` / `mov eax, SYS_exit`. There's zero advantage to your confusing mov/dec/mov/mov imm. – Peter Cordes Sep 16 '16 at 00:16
  • Similarly, there's zero advantage, and a lot of confusion, from your crazy PUSH/POP code earlier. `xor ecx, ecx` / `mov cl, V_Size` saves one byte vs. MOV r32, imm32, and only works if V_Size fits in 8 bits. (Like you said, avoid hard-coding assumptions into your program!) It also wastes an instruction, which matters more than one code byte on any CPU that supports AMD64. [LOOP is a really slow instruction that you shouldn't use either.](http://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently) – Peter Cordes Sep 16 '16 at 00:20
  • Even though your comments have varying degrees of legitimacy, they are superfluous to the question **How to convert number into ASCII and write to display buffer**. To that end, code between 4000BC & 4000CD addresses that implicitly. The point is, this example reads the array, converts it appropriately and displays on console. – Shift_Left Sep 16 '16 at 00:55
  • Working code is not sufficient to make a good answer. It's important that it be readable / understandable to human audiences. Your text explanation of having the assembler calculate the length is very good, and makes that part of your code clear. The `[ebx] += 'A'` comment in your AT&T disassembly sort of gets the point across, but I wouldn't be surprised if beginners (the only people that won't already know the answer) would have a hard time following your code. – Peter Cordes Sep 16 '16 at 01:26
  • If you were trying to demonstrate efficient programming or something, that can be ok *if* you explain in comments why your code does what it does. Feel free to write answers however you like, but I'm trying to make suggestions that would make this answer more useful to future readers. IDK if you care about upvotes, but the reasons stated above are why I didn't upvote it. I'd be interested to see comments explaining the motivation / advantages for your unusual way of doing things. – Peter Cordes Sep 16 '16 at 01:27