Assembly: do we need the endings?

Question

In assembly (AT&T syntax) the following is legal:

mov %rax, %rbx

which is equivalent to:

movq %rax, %rbx

Here the q means the operands are 64 bits wide. My question is:

Is this q (or any other ending) there only to make the code easier for humans to read, or are there cases where leaving it out gives a wrong answer, a result different from the expected one, or even crashes the code (illegal instruction)? Please give an example if possible.

`movq $1, (%rbx)` needs the Q to determine the size of the memory operand. — ecm, May 25 '21 at 15:25
I think there have been other duplicates of this, as well as the one I linked, but maybe only for NASM. (It's the same for all assemblers, except bad ones that have some random default for ambiguous cases). The one I linked does mention AT&T syntax in Cody's answer. — Peter Cordes, May 25 '21 at 17:52

Nate Eldredge · Accepted Answer · 2021-09-01T14:45:22.623

You're asking about the operand size suffix. There are two cases:

For many instructions, the operand size can be inferred from the operands themselves, typically because they are registers of a particular size. This is like your example: mov %rax, %rbx must have a 64-bit operand size, because %rax and %rbx are 64-bit registers. In this case, the suffix is optional, and the same machine code (48 89 c3) is generated whether you write mov %rax, %rbx or movq %rax, %rbx. It's purely a matter of style whether you include it or not (though people certainly have opinions about which way is better).

If you provide a suffix that is inconsistent with the operands, e.g. movl %rax, %rbx, the assembler will give a warning or error.

For other instructions, the operand size cannot be inferred. The most common case is when one operand is an immediate and the other is a memory reference. If you write mov $1, (%rbx), as in ecm's example from the comments, it is ambiguous: should the assembler emit an instruction to store a byte (machine code c6 03 01), or a word (two bytes, 66 c7 03 01 00), or a long (four bytes, c7 03 01 00 00 00), or a quad (eight bytes, 48 c7 03 01 00 00 00)? So a suffix is required in this case: you must write movb $1, (%rbx) or movw $1, (%rbx) and so on.
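As a rough sketch of the two cases, a fragment like the following could be fed to GNU as targeting x86-64. The label suffix_demo is made up for illustration, and the code is only meant to be assembled and inspected, not called. The encodings in the comments are the ones quoted above; the commented-out lines are the forms an assembler should reject or warn about.

```asm
# Sketch only: assumes GNU as (AT&T syntax) targeting x86-64.
# suffix_demo is a hypothetical label, not something from the question.
        .text
        .globl  suffix_demo
suffix_demo:
        # Case 1: the registers fix the operand size, so the suffix is optional.
        mov     %rax, %rbx      # 48 89 c3
        movq    %rax, %rbx      # 48 89 c3 (identical encoding)
        # movl  %rax, %rbx      # suffix contradicts the registers: diagnosed

        # Case 2: immediate to memory, so the suffix selects the store width.
        movb    $1, (%rbx)      # c6 03 01              writes 1 byte
        movw    $1, (%rbx)      # 66 c7 03 01 00        writes 2 bytes
        movl    $1, (%rbx)      # c7 03 01 00 00 00     writes 4 bytes
        movq    $1, (%rbx)      # 48 c7 03 01 00 00 00  writes 8 bytes
        # mov   $1, (%rbx)      # ambiguous: no register to infer a size from
        ret
```

Assembling this with as and disassembling the object file with objdump -d is one way to compare the emitted bytes against the comments.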

If you omit the suffix in such a case, recent assembler versions should at least warn you. Some will then abort with an error; others may guess at the operand size or use some built-in default, as Peter comments below. Some older assembler versions would actually revert to a default without warning.

So in principle, yes, omitting the suffix could lead to "wrong" code, in some cases and with some assemblers. However, current versions of the widely used AT&T-syntax assemblers would at least warn you.

There is, however, one other way that this can happen, in a sense: suppose you want to add 5 to the 32-bit register eax (addl $5, %eax), but you make a typo and leave off the e. If you are in the habit of using the suffixes, you would write addl $5, %ax and get an assembly error, alerting you to your mistake. If your style is to omit them, you would write add $5, %ax and the code would build perfectly but would be "wrong".
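The typo scenario might look like this in a source file. This is only a sketch, assuming GNU as in AT&T syntax on x86-64; the last line is left commented out because it is the one that should fail to assemble.

```asm
# Sketch only: assumes GNU as (AT&T syntax) targeting x86-64.
        addl    $5, %eax        # intended: 32-bit add
        add     $5, %eax        # same instruction with the suffix omitted
        add     $5, %ax         # typo: accepted as a 16-bit add; only ax changes,
                                # the upper half of eax keeps its old value
        # addl  $5, %ax         # with the suffix habit, the typo is caught here
```

The third line assembles without complaint because the register alone determines a valid operand size, which is exactly why the mistake goes unnoticed.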

Fun fact: for ambiguous instructions other than `mov`, GAS defaults to dword: e.g. `add $1, (%rax)` assembles to `83 00 01 add DWORD PTR [rax],0x1`. In the last year or so, GAS finally added a warning: *Warning: no instruction mnemonic suffix given and no register operands; using default for 'add'*. I don't know how GAS ended up with this bad design, or why it didn't even warn until a very recent version. (I have `as` 2.35.1), although I read something about Unix assemblers having a dword default. clang's built-in assembler (correctly) rejects the ambiguous instruction. — Peter Cordes, May 25 '21 at 17:32
