Why do we need to disambiguate when adding an immediate value to a value at a memory address?

Question

Unless we specify a size operator (such as byte or dword) when adding an immediate value to a value stored at a memory address, NASM will return an error message.

section .data           ; Section containing initialized data

    memory_address: db "PIPPACHIP"

section .text           ; Section containing code

global  _start          ; Linker needs this to find the entry point!

_start:

    mov ebx, memory_address     ; line 23 in the original source file
    add [ebx], 32               ; line 24 in the original source file

........................................................

24:  error: operation size not specified.

Fair’s fair.

I’m curious as to why this is so, however, as the two following segments of code will yield the same result.

add byte [ebx], 32

or

add dword [ebx], 32

So what difference does it make (other than it not making much sense to use dword in this instance)? Is it simply because “NASM says so”? Or is there some logic here that I am missing?

If the assembler can infer the operand size from a register name (for example, add [ebx], eax would work), why not do the same for an immediate value, i.e. just go ahead and calculate the size of the immediate value up front?

What is the requirement that means a size operator needs to be specified when adding an immediate value to a value at a memory address?

NASM version 2.11.08 Architecture x86

With your example indeed you won't see a difference, but try adding 32 to, say, 240. — Jester, Nov 22 '17 at 23:31
Try `add byte [ebx],250` vs `add dword [ebx],250` ... they should be the same according to your logic. But result will be different. — Ped7g, Nov 23 '17 at 00:47
@Jester Thank you. That makes sense. The size operator is instructing the CPU on how many bits to switch in memory, regardless of the resulting sum, and the carry flag is set if the result of the arithmetic “carries out” a bit, i.e. the CPU is not physically able to ‘record’ the result in the number of bits specified by the size operator. — Andrew Hardiman, Nov 24 '17 at 17:36
Related: [What's the difference between 0 and dword 0?](https://stackoverflow.com/q/34325587) for basics of what operand-size overrides actually do. — Peter Cordes, Oct 10 '20 at 10:07

Peter Cordes · Accepted Answer · 2020-10-10T10:06:50.403

It does matter what operand-size you use for several reasons, and it would be weird and unintuitive / non-obvious to have the size implied by the integer value. It's a much better design to have NASM error when there's ambiguity because neither operand is a register.

As the two following segments of code will yield the same result:
add byte [ebx], 32
add dword [ebx], 32

They only yield the same result because 'P' + 32 doesn't carry into the next byte.

Flags are set according to the result. If the 4th byte had its high bit set, then SF would be set for the dword version.
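A minimal sketch of the difference, assuming 32-bit Linux and two made-up 4-byte buffers (`buf_b` and `buf_d` are hypothetical labels, not from the question). With a first byte of 240 instead of 'P', the byte-sized add wraps within one byte, while the dword-sized add carries into the second byte:

```nasm
; Sketch only: buf_b and buf_d are hypothetical labels, not from the question.
; Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
    buf_b: db 240, 0, 0, 0      ; target of the byte-sized add
    buf_d: db 240, 0, 0, 0      ; target of the dword-sized add

section .text
global _start

_start:
    mov ebx, buf_b
    add byte [ebx], 32          ; 240 + 32 = 272 = 0x110: only the low byte is stored
                                ; buf_b is now 16, 0, 0, 0 and CF = 1

    mov ebx, buf_d
    add dword [ebx], 32         ; the same sum fits easily in 32 bits
                                ; buf_d is now 16, 1, 0, 0 (little-endian 0x00000110), CF = 0

    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 0x80
```

Inspecting the two buffers in a debugger after each `add` should show the second byte changing only in the dword case.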

re: comments about how CF works:

Carry-out from an add is always 0 or 1. i.e. the sum of two N-bit integers will always fit in an (N+1)-bit integer, where the extra bit is CF. Think of the add eax, ebx as producing the result in CF:EAX, where each bit can be 0 or 1 depending on the input operands.
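As a rough illustration with 8-bit operands (the values are chosen for the example and are not from the question), the 9-bit result of an 8-bit add can be read as CF:AL. Even the largest possible inputs produce a carry-out of exactly 1:

```nasm
; Sketch only: operand values are chosen for illustration.
; Assumed build: nasm -f elf32 carry.asm && ld -m elf_i386 -o carry carry.o
section .text
global _start

_start:
    mov al, 240
    add al, 32                  ; 240 + 32 = 272 = 1_0001_0000b
                                ; AL = 0001_0000b (16), CF = 1 (bit 8 of the sum)

    mov al, 255
    add al, 255                 ; 255 + 255 = 510 = 1_1111_1110b, the largest 8-bit sum
                                ; AL = 1111_1110b (254), CF = 1: still only one extra bit

    mov al, 100
    add al, 32                  ; 132 fits in 8 bits: AL = 132, CF = 0

    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 0x80
```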

Also, if ebx was pointing at the last byte in a page, then dword [ebx] could segfault (if the next page was unmapped), but byte [ebx] wouldn't.

This also has performance implications: read-modify-write of a byte can't store-forward to a dword load, and a dword read-modify-write accesses all 4 bytes. (It also matters for correctness if another thread had just modified one of those other bytes before this thread stored the old value over it.)

For these and various other reasons, it matters whether the opcode for the instruction that NASM assembles into the output file is the opcode for add r/m32, imm8 or add r/m8, imm8.
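For reference, here is a sketch of the machine code for each size in 32-bit mode. It assumes NASM's default optimization, which picks the sign-extended imm8 form for the wider sizes; with optimization disabled, the dword form would use a full imm32 instead. A listing file (for example `nasm -f bin -l sizes.lst sizes.asm`, where the file names are placeholders) shows the bytes:

```nasm
; Sketch only: encodings assume 32-bit mode and NASM's default optimization.
bits 32
    add byte  [ebx], 32         ; 80 03 20      add r/m8,  imm8
    add dword [ebx], 32         ; 83 03 20      add r/m32, imm8 (sign-extended)
    add word  [ebx], 32         ; 66 83 03 20   add r/m16, imm8 (operand-size prefix)
```

The ModRM byte (03) and the immediate (20) are identical, so the source text without a size keyword gives the assembler nothing to choose the opcode by.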

It's a Good Thing that it forces you to be explicit about which one you mean instead of having some kind of default. Basing it on the size of the immediate would be confusing, too, especially when using an ASCII_casebit equ 0x20 constant. You don't want the operand-size of your instructions to change when you change a constant.
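A small sketch of that last point, assuming 32-bit Linux (the label `msg` is hypothetical; `ASCII_casebit` is the constant named above). The size keyword pins the operand size, so editing the constant's value later cannot silently change how many bytes are written:

```nasm
; Sketch only: msg is a hypothetical label; ASCII_casebit is the constant from the text.
; Assumed build: nasm -f elf32 case.asm && ld -m elf_i386 -o case case.o
ASCII_casebit equ 0x20

section .data
    msg: db "PIPPACHIP"

section .text
global _start

_start:
    mov ebx, msg
    add byte [ebx], ASCII_casebit   ; 'P' (0x50) + 0x20 = 'p' (0x70); only msg[0] changes

    mov eax, 1                      ; sys_exit
    xor ebx, ebx                    ; exit status 0
    int 0x80
```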

edited Oct 10 '20 at 10:06

answered Nov 22 '17 at 23:35


Thank you. I see. E.g. if I specify the size operator word, and the result of my addition increases the original data beyond the size of one byte (one memory address), I will be changing the data stored in the ‘next’ byte in memory as well, even if this is not my intention. Whereas, if I specify byte, I will only be changing the data in the ‘first’ byte, regardless of the resulting sum; and `CF` / `OF` will be set accordingly. What happens to the data that is carried? Is it lost? i.e. A carry is flagged, but there is no direct information as to what has been carried? Carry is not always 1. – Andrew Hardiman Nov 24 '17 at 17:15
@case_2501: carry-out from an `add` is always 0 or 1. i.e. the sum of two N-bit integers will always fit in an (N+1)-bit integer, where the +1 is CF. – Peter Cordes Nov 24 '17 at 22:15
@case_2501: there is no memory-destination `mul` or `imul`, and the one-operand form gives you the full-multiply result in (e)dx:(e)ax, or ah:al. e.g. `mul dword [ebx]` does `edx:eax = eax * dword [ebx]`. If you don't care about the upper half of the multiply (e.g. C semantics for `c = a*123`), you can do `imul r32, r/m32, imm8/32` (where the middle operand can be register or memory), or you can do `imul r32, r/m32` – Peter Cordes Nov 24 '17 at 22:20
“the sum of two N-bit integers will always fit in an (N+1)-bit integer”, this makes perfect sense. However, I am finding it difficult to understand how the carry is always 1 when, for example, we add an 8-bit integer to a 16-bit integer. If, in my original example, I add 4,095, instead of 32, `add byte [ebx], 4095` the carry flag is set accordingly. However, the flag is not indicating to me here that the next highest bit is now a 1, like it would be for the sum of two N-bit integers, rather just that there has been a carry of X value, how do you know X? – Andrew Hardiman Nov 25 '17 at 16:56
@case_2501: The carry-out isn't always `1`, it's set to the high bit of the `N+1` bit result. Think of the `add eax,ebx` as producing the result in `CF:EAX`, where each bit can be 0 or 1 depending on the input operands. – Peter Cordes Nov 25 '17 at 23:13
`4095` doesn't fit in an 8-bit immediate, so `add byte [ebx], 4095` is not encodable. Many assemblers will truncate it to `add byte [ebx], 0xFF`. – Peter Cordes Nov 25 '17 at 23:13
