movzx and cwd - are they interchangeable?

Question

I have these two code snippets:

   mov ax, word [wNum2]
   cwd
   div word [wNum3]
   mov [wAns16], dx

   movzx eax, word [wNum2]
   ;cwd
   div word [wNum3]
   mov [wAns16], edx

The first produces the correct answer; the second gives an answer that is a hundred or so off unless I uncomment cwd.

I thought movzx would zero everything out for me, which would make cwd unnecessary. Have I completely misunderstood how they work? Can someone walk me through these code snippets?

zx485 · Accepted Answer · 2020-02-01T03:38:02.897

Whether the bare result is equivalent depends on the value. The description of CWD states:

Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register.

So if the value in AX is at most 32,767 (0x7FFF, the 15-bit maximum, so bit 15 is clear), CWD gives the same result as MOVZX (zero extend) and MOVSX (sign extend). If the value is greater, CWD is only equivalent to MOVSX. Usually MOVZX is used in combination with DIV (unsigned division) and MOVSX in combination with IDIV (signed division).
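To make the bit-15 condition concrete, here is a small Python model of the three extensions. It is only a sketch of the arithmetic, not assembly; the function names `cwd`, `dx_ax`, `movzx32` and `movsx32` are made up for this illustration.

```python
def cwd(ax):
    """Model CWD: return DX, which is bit 15 of AX copied into all 16 bits."""
    return 0xFFFF if (ax & 0x8000) else 0x0000

def dx_ax(ax):
    """Combine the DX produced by CWD with AX into one 32-bit value."""
    return (cwd(ax) << 16) | (ax & 0xFFFF)

def movzx32(value16):
    """Model MOVZX r32, r/m16: the upper 16 bits of the result are zero."""
    return value16 & 0xFFFF

def movsx32(value16):
    """Model MOVSX r32, r/m16: the upper 16 bits are copies of bit 15."""
    value16 &= 0xFFFF
    return (value16 | 0xFFFF0000) if (value16 & 0x8000) else value16

# All three agree while bit 15 is clear; from 0x8000 upward only CWD and MOVSX agree.
for v in (0x0000, 0x1234, 0x7FFF, 0x8000, 0xFFFF):
    print(f"{v:#06x}: cwd {dx_ax(v):#010x}  movzx {movzx32(v):#010x}  movsx {movsx32(v):#010x}")
```

Note that the model compares 32-bit values only; on the real CPU the CWD result is split across DX and AX, while MOVZX/MOVSX write EAX.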

But there remains the problem of where the result will be stored:
CWD stores the 32-bit result in two 16-bit registers DX:AX, while the MOV?X instructions store it in the 32-bit register EAX.

This has consequences for the following DIV instruction. The first part of your code uses the 32-bit value in DX:AX as input, while the second approach assumes EAX to be the input of a 16-bit DIV:

F7 /6   DIV r/m16   M   Valid   Valid   Unsigned divide DX:AX by r/m16, with result stored in AX ← Quotient, DX ← Remainder.

which makes the result unpredictable: the second snippet never sets DX, so DX holds whatever value was left there earlier, and the upper half of EAX is ignored by the division.
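A rough Python model of the 16-bit DIV shows how a leftover DX changes the outcome. This is a sketch, not assembly: `div16` is a name invented here, the operand values are hypothetical, and `ZeroDivisionError` stands in for the #DE fault the CPU raises.

```python
def div16(dx, ax, divisor):
    """Model DIV r/m16: divide the 32-bit value DX:AX by a 16-bit divisor.

    Returns (quotient, remainder) as they would land in AX and DX.
    Raises ZeroDivisionError where the CPU would raise #DE.
    """
    divisor &= 0xFFFF
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise ZeroDivisionError("#DE: quotient does not fit in AX")
    return quotient, remainder

# Hypothetical values for wNum2 and wNum3
num2, num3 = 1000, 7

print(div16(0, num2, num3))   # DX cleared first: (142, 6)
print(div16(5, num2, num3))   # DX left holding a stale 5: (46954, 2)
```

With DX = 5 the dividend is really 5 * 65536 + 1000, so both quotient and remainder change. A stale DX that is greater than or equal to the divisor would overflow the quotient and fault instead.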

You should probably mention that you normally want to *zero*-extend before `div` (i.e. `xor edx,edx`), and only sign-extend before `idiv`. [When and why do we sign extend and use cdq with mul/div?](//stackoverflow.com/q/36464879) and [Why should EDX be 0 before using the DIV instruction?](//stackoverflow.com/a/38416896) — Peter Cordes, Feb 01 '20 at 03:25

score 2 · Answer 2 · answered Feb 01 '20 at 03:39

No, MOVZX is zero extension, not sign. And CWD sign-extends AX into DX:AX (like you want before IDIV, not DIV).

movsx eax, word [wNum2] is a more efficient way to do mov ax, mem + CWDE, not CWD. (If your inputs are known to be non-negative when treated as signed, sign extension and zero extension do the same thing.)

"What does cltq do in assembly?" has a table of cbw/cwde/cdqe and the equivalent movsx instruction, and of what cwd/cdq/cqo do (and the equivalent mov/sar).

None of these things are what you want before unsigned div: use xor edx,edx to zero DX, the high-half input for 32/16 => 16-bit division.
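As an illustration of why sign extension is the wrong setup for unsigned DIV, the following Python sketch compares the DX left by cwd with the DX left by xor edx,edx. The helper names and the sample values are assumptions made for this example, and `None` stands in for the #DE fault on quotient overflow.

```python
def prepare_dx(ax, how):
    """Return the DX value each setup leaves behind for a 16-bit AX."""
    if how == "cwd":              # sign-extend: copy bit 15 of AX into DX
        return 0xFFFF if (ax & 0x8000) else 0
    if how == "xor edx,edx":      # zero-extend: DX = 0
        return 0
    raise ValueError(how)

def unsigned_div16(dx, ax, divisor):
    """DIV r/m16 on DX:AX; returns None where the CPU would fault on overflow."""
    quotient, remainder = divmod((dx << 16) | ax, divisor)
    return None if quotient > 0xFFFF else (quotient, remainder)

for ax in (0x1234, 0x9000):       # hypothetical wNum2 values
    for how in ("cwd", "xor edx,edx"):
        print(hex(ax), how, unsigned_div16(prepare_dx(ax, how), ax, 3))
```

For 0x1234 both setups give the same answer, which matches the first snippet appearing to work. For 0x9000, cwd turns the dividend into 0xFFFF9000 and the quotient no longer fits in AX.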

See also "When and why do we sign extend and use cdq with mul/div?"

To avoid false dependencies from writing partial registers, on most recent CPUs the most efficient thing would be to do a movzx load just to get your 16-bit value into AX without merging into the previous value of RAX/EAX. Similarly, xor-zeroing isn't (usually?) recognized as a zeroing idiom on partial registers, so you want 32-bit operand-size even if you're only going to read the low half of the register:

   movzx eax, word [wNum2]      ; zero-extend only to avoid false dep from merging into EAX
   xor   edx, edx               ; high half dividend = DX = 0
   div   word [wNum3]
   mov   [wAns16], dx           ; store remainder from DX, not EDX

Your code storing 32-bit EDX into [wAns16] is presumably a bug, assuming there's only 2 bytes of space there before you step on whatever comes after it.
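The effect of the oversized store can be sketched in Python with a byte buffer. The layout is an assumption: `wNext` is a hypothetical 2-byte variable placed directly after wAns16, and the remainder value is made up.

```python
import struct

# Hypothetical data layout: wAns16 is a 2-byte slot followed directly by
# another 2-byte variable (called wNext here purely for illustration).
memory = bytearray(4)
W_ANS16, W_NEXT = 0, 2
struct.pack_into("<H", memory, W_NEXT, 0xBEEF)   # wNext = 0xBEEF

edx = 0x00000006                                 # remainder, upper half zero

# mov [wAns16], dx  -> 2-byte little-endian store, wNext untouched
struct.pack_into("<H", memory, W_ANS16, edx & 0xFFFF)
after_word_store = bytes(memory)

# mov [wAns16], edx -> 4-byte store that also covers wNext
struct.pack_into("<I", memory, W_ANS16, edx)
after_dword_store = bytes(memory)

print(after_word_store.hex())    # 0600efbe
print(after_dword_store.hex())   # 06000000
```

In this model the 4-byte store replaces wNext with the upper half of EDX, even though the value written to wAns16 itself is the same in both cases.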
