
Background:

I have been learning x86_64 assembly using NASM on a Linux system and was writing a subroutine for strlen(const char *str).

I wanted to load one byte at a time from a pointer stored in rax and compare it to 0 to find the end of the string. When I used mov rbx, [rax] I didn't get the results I wanted; I later learned this was the wrong approach, because rbx is a quadword register and the mov loads 8 bytes at once. The suggested solution was to use bl instead, which did work.

I also found the movzbq mnemonic in AT&T syntax, which loads one byte and zero-extends it to a quadword. To my newbie eyes it looked like a neater, more accurate representation of the intended operation, since it not only writes bl but also clears the remaining 7 bytes of rbx.


Question:

Is there an equivalent to movzbq and other AT&T mov variants in NASM?

Which is better coding practice, movzbq rbx, [address] or mov bl, [address]?


Thank you,

chqrlie
Ziad
  • 2
    Near duplicate of [How to load a single byte from address in assembly](https://stackoverflow.com/q/20727379), except for the part about how AT&T syntax works. The top of that answer suggests the less-useful way, mov into a byte reg, but the rest of the answer suggests the standard way, zero-extend into the full register. – Peter Cordes Jan 18 '22 at 03:22
  • @PeterCordes Thank you for the referral. I am new to asking questions here on stackoverflow, is there an easy way to find duplicates before posting or is it up to community members to notice them? – Ziad Jan 18 '22 at 04:04
  • 1
    Usually with google, e.g. `site:stackoverflow.com x86-64 load byte` (or maybe "zero extend byte", but I didn't try that). Or "translate AT&T to Intel", e.g. assemble with GAS, then disassemble with objconv into NASM syntax, or with `objdump -drwC -Mintel`. But it's not generally expected that newbies will come up with the right search terms if they were thinking about a problem from a different direction or weren't aware of the standard terminology. I already knew that question existed (since I wrote half the answer), so it comes up high in my google searches, too. – Peter Cordes Jan 18 '22 at 04:15

1 Answer


In Intel syntax, which NASM uses, the size of a memory operand is indicated by a size keyword in front of the operand rather than by a suffix on the mnemonic. So the equivalent of movzbq is movzx rbx, byte [address].
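As a rough cheat sheet, the other AT&T extending moves map onto NASM the same way: the two size letters in the AT&T suffix become the destination register width and the size keyword on the memory operand. This is only a sketch; the label `ext_examples` and the use of rax as the pointer are assumptions made for illustration, and the AT&T forms in the comments are the ones GAS is expected to accept.

```nasm
; Sketch only. Assumed build: nasm -f elf64 file.asm
; Assumes rax holds a valid pointer; ext_examples is a hypothetical label.
section .text
ext_examples:
    movzx   ebx, byte  [rax]    ; AT&T: movzbl (%rax), %ebx
    movzx   rbx, byte  [rax]    ; AT&T: movzbq (%rax), %rbx
    movzx   ebx, word  [rax]    ; AT&T: movzwl (%rax), %ebx
    movzx   rbx, word  [rax]    ; AT&T: movzwq (%rax), %rbx

    movsx   ebx, byte  [rax]    ; AT&T: movsbl (%rax), %ebx   (sign-extend)
    movsx   rbx, byte  [rax]    ; AT&T: movsbq (%rax), %rbx
    movsx   rbx, word  [rax]    ; AT&T: movswq (%rax), %rbx
    movsxd  rbx, dword [rax]    ; AT&T: movslq (%rax), %rbx

    ; movzx has no form with a 32-bit source: a plain 32-bit mov
    ; already zero-extends into the full 64-bit register.
    mov     ebx, [rax]          ; AT&T: movl (%rax), %ebx
    ret
```

With `mov`, the size keyword can be omitted because the register operand implies it; with `movzx` and `movsx` the source size is not implied by the destination, so the keyword on the memory operand is required.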

However, writes to 32-bit registers automatically zero-extend into the 64-bit register. So movzx ebx, byte [address] is equivalent and saves one byte of code (no REX prefix needed).
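To make the size difference concrete, here is a small sketch with the machine-code bytes in the comments, assuming the pointer is in rax. The label `zx_demo` is hypothetical, and the one-byte saving only holds when the addressing mode does not itself need a REX prefix (for example, no r8 to r15 involved).

```nasm
; Sketch only. Assumed build: nasm -f elf64 file.asm
; Assumes rax holds a valid pointer; zx_demo is a hypothetical label.
section .text
zx_demo:
    mov     rbx, -1             ; fill rbx with all ones to make the effect visible

    movzx   rbx, byte [rax]     ; encoding: 48 0F B6 18  (REX.W prefix + opcode + ModRM)
    movzx   ebx, byte [rax]     ; encoding:    0F B6 18  (same opcode, no REX prefix)

    ; After either instruction, bits 8..63 of rbx are zero and bl holds
    ; the loaded byte, because a write to ebx clears the upper 32 bits.
    ret
```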

Generally movzx ebx, byte [address] is preferable to mov bl, [address] because it overwrites the entire register. Writing to the byte-sized register can have a minor performance penalty in some cases; see "Why doesn't GCC use partial registers?" for details. On the other hand, mov bl, [address] is one byte shorter, which may be worth the tradeoff if you need to optimize for size.
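Applied to the strlen routine from the question, a minimal sketch might look like the following. It assumes the System V AMD64 calling convention (pointer in rdi, result in rax), and `strlen_sketch` is a hypothetical name. It uses edx rather than ebx as the scratch register because rbx is callee-saved in that convention. It is a simple byte-at-a-time loop, not a tuned implementation.

```nasm
; size_t strlen_sketch(const char *str)
; Sketch only. Assumes System V AMD64: str in rdi, return value in rax.
; Assumed build: nasm -f elf64 file.asm
global strlen_sketch
section .text
strlen_sketch:
    mov     rax, rdi            ; rax = cursor, starts at the first character
.next:
    movzx   edx, byte [rax]     ; load one byte, zero-extended into rdx
    inc     rax                 ; advance the cursor past the byte just read
    test    edx, edx            ; was that byte the 0 terminator?
    jnz     .next               ; no: keep scanning
    sub     rax, rdi            ; rax = bytes consumed, including the terminator
    dec     rax                 ; exclude the terminator from the length
    ret
```

If the loaded value is not needed afterwards, `cmp byte [rax], 0` compares against memory directly and avoids the question of which register form to load into.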

Nate Eldredge