
I have an assembly program that iterates through an array and accumulates the sum of its elements. When I declare myArray as byte, EDX = 01030203, whereas when I declare myArray as DWORD, EDX = 00000003. Can anyone tell me why this happens?

.data

    myArray byte 3, 2, 3, 1, 7, 5, 8, 9, 2


.code
main PROC
    
    
    mov eax, 0
    mov ecx, 9
    mov esi, OFFSET myArray

top:
    mov edx, 0
    mov edx, [esi]
    add eax, edx
    inc esi
    dec ecx
    jnz top




main ENDP
END main
Peter Cordes
    Same as the difference between `uint8_t` and `uint32_t` in C. (And always loading via a `uint32_t*` since EDX is a dword register. But without strict-aliasing UB since this is asm.) – Peter Cordes Nov 29 '22 at 18:24
    There's several things here. First is a duplicate, of what is byte vs. word here https://stackoverflow.com/a/7750439/471129. Second is what happens when you have size-mismatched (e.g. byte) data but use dword-sized loads and stores (you'll load multiple data elements as if together they were a single number). And third, how to use byte vs. dword, is: use byte-sized loads (e.g. `mov al,...`) with byte data and dword-sized loads with dword-sized data (e.g. `mov eax,...`). – Erik Eidt Nov 29 '22 at 20:42

1 Answer

A dword takes up 4 bytes, regardless of how "big" the number you wrote there appears to be.

For example:

.data
    myArray dword 1,2,3,4,5

In a hex editor, if you looked at the .data section of the above assembled program (the .exe file, not the source code document), this is what you would see:

01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00
05 00 00 00

Meanwhile if you have the following:

.data
    myArray byte 1,2,3,4,5

you would instead see 01 02 03 04 05 in your hex editor.
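To make the two layouts concrete, here is a minimal C sketch of what the assembler emits for each directive. C is used only as a model, in the spirit of the `uint8_t` vs `uint32_t` comparison in the comments; `emit_dwords` and `emit_bytes` are hypothetical helper names, not part of MASM or any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: models a `dword` directive. Each value becomes
   4 bytes, lowest byte first (little-endian, as on x86).
   `out` must have room for 4 * count bytes. Returns bytes written. */
static size_t emit_dwords(const uint32_t *values, size_t count, uint8_t *out)
{
    assert(values != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        out[n++] = (uint8_t)(values[i] & 0xFFu);
        out[n++] = (uint8_t)((values[i] >> 8) & 0xFFu);
        out[n++] = (uint8_t)((values[i] >> 16) & 0xFFu);
        out[n++] = (uint8_t)((values[i] >> 24) & 0xFFu);
    }
    return n;
}

/* Hypothetical helper: models a `byte` directive. Each value becomes
   exactly 1 byte. `out` must have room for count bytes. */
static size_t emit_bytes(const uint8_t *values, size_t count, uint8_t *out)
{
    assert(values != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = values[i];
    }
    return count;
}
```

For the values 1,2,3,4,5, `emit_dwords` writes 20 bytes (01 00 00 00 02 00 00 00 and so on), while `emit_bytes` writes only 5.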

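x86 can load different sizes from memory, and the size of the load is chosen by the register operand. When you do something like mov edx,[esi], the 32-bit destination edx makes it a 4-byte load, so the CPU reads the four bytes starting at esi no matter how you declared the data. x86 is little-endian, so the byte at the lowest address ends up in the low 8 bits of edx. The first load fills edx as if you had declared in your code the following: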

myArray dword 0x01030203,0x09080507,0x??????02 

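I'm using question marks because those bytes lie past the end of the array, so they hold whatever happens to follow it in memory (probably zeroes, but nothing guarantees it). Also note that your loop advances esi by one byte at a time, so each later load overlaps the previous one (the second load is 07010302), but every load still pulls in four bytes.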
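As a rough illustration of what a 4-byte load does to byte-sized data, here is a small C model. It is a sketch only: `load_dword_le` and `load_byte_zx` are hypothetical helpers that mimic the loads by hand, so the result does not depend on the endianness of the machine running the C code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The same nine bytes as myArray in the question. */
static const uint8_t my_array[9] = {3, 2, 3, 1, 7, 5, 8, 9, 2};

/* Hypothetical helper: models `mov edx, [esi]`. It combines the four
   bytes p[0..3] the way a little-endian x86 CPU does. The caller must
   make sure all four bytes are inside the array. */
static uint32_t load_dword_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: models a one-byte load that is zero-extended
   to 32 bits, so only p[0] contributes to the result. */
static uint32_t load_byte_zx(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0];
}
```

With the question's data, `load_dword_le(my_array)` gives 0x01030203 and `load_dword_le(my_array + 1)` gives 0x07010302, while `load_byte_zx(my_array)` gives 3.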

The important takeaway is that the CPU has no idea what type your data actually is. It relies on the programmer to use the correct destination register sizes for loading and storing. Luckily, this is a pretty easy fix.

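MOV DL,[ESI]  ;loads only one byte, into the low 8 bits of EDX

Since your loop already does mov edx, 0 right before the load, the upper 24 bits of edx stay zero and add eax, edx adds just that one byte. movzx edx, byte ptr [esi] does the zeroing and the byte load in a single instruction. Don't load into DH here: DH is bits 8-15 of edx, so add eax, edx would add 256 times the value.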
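For comparison, here is a C sketch of the loop once each load is one byte wide. The variable names only mirror the registers in the question, and `sum_bytes` is a hypothetical name chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The same nine bytes as myArray in the question. */
static const uint8_t my_array[9] = {3, 2, 3, 1, 7, 5, 8, 9, 2};

/* Hypothetical helper: models the question's loop with a byte-sized,
   zero-extended load on every iteration. */
static uint32_t sum_bytes(const uint8_t *esi, size_t ecx)
{
    assert(esi != NULL);
    uint32_t eax = 0;            /* mov eax, 0 */
    while (ecx != 0) {
        uint32_t edx = *esi;     /* one byte, upper 24 bits zero */
        eax += edx;              /* add eax, edx */
        esi++;                   /* inc esi */
        ecx--;                   /* dec ecx / jnz top */
    }
    return eax;
}
```

Calling `sum_bytes(my_array, 9)` returns 40, which is 3+2+3+1+7+5+8+9+2.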

In general, the "main registers" ax, bx, cx, and dx follow this pattern (shown here for ax):

  • ax is a 16-bit register. It is the "low half" of eax.
  • eax is a 32-bit register. It is the "low half" of rax, which only exists in 64-bit mode.
  • al is the "low half" of ax, and ah is the "high half" of ax.
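The register pattern above can be modelled in C with shifts and masks. This is only a sketch: the function names are hypothetical, and a plain `uint32_t` stands in for eax.

```c
#include <assert.h>
#include <stdint.h>

/* ax is bits 0-15 of eax. */
static uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* al is bits 0-7 of eax (the low half of ax). */
static uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* ah is bits 8-15 of eax (the high half of ax). */
static uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Writing al replaces bits 0-7 and leaves the other 24 bits alone. */
static uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | (uint32_t)value;
}

/* Writing ah replaces bits 8-15 and leaves the other 24 bits alone. */
static uint32_t write_ah(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFF00FFu) | ((uint32_t)value << 8);
}
```

For eax = 0x01030203, `reg_ax` gives 0x0203, `reg_al` gives 0x03 and `reg_ah` gives 0x02. `write_ah(0, 3)` gives 0x300, which is why a byte loaded into the high-half register shows up as 256 times its value when the full 32-bit register is used.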
puppydrum64