The movzx instruction zero-extends a value with a smaller width to fit into a larger-width register. For example, movzx would be used to move a 16-bit value into a 32-bit register. (It is in contrast to movsx, which does the same thing except with sign extension. You would use movzx when the value is unsigned, and movsx when the value is signed.)
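To make the difference concrete, here is a small sketch (386 or later, since it uses movzx and movsx themselves) that extends the same byte both ways. The registers and the value 0F0h are arbitrary choices for illustration:

```asm
mov   bl, 0F0h      ; BL = F0h: 240 if read as unsigned, -16 if read as signed
movzx eax, bl       ; EAX = 000000F0h (upper 24 bits cleared, still 240)
movsx ecx, bl       ; ECX = FFFFFFF0h (upper 24 bits copy bit 7, still -16)
```

The bit pattern in the low byte is identical in both results. Only the bits above it differ.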
As you point out, these instructions were not introduced until the 386, so if you're targeting an earlier generation of processor, then you'll need to find an alternative.
The basic strategy is, as others pointed out in the comments, to zero the destination register first, and then move the smaller value in. This will accomplish exactly the same thing as movzx. The obvious way to zero a register is to move an immediate zero into it (mov reg, 0), but it is smaller and faster to XOR the register with itself (xor reg, reg). Therefore, code like:
movzx edx, WORD PTR [bx]
can be replaced with:
xor edx, edx
mov dx, WORD PTR [bx]
On modern processors, this is slower than movzx, but it will be at least as fast on the 386 and 486, where movzx is relatively slow. And of course, on processors where movzx doesn't exist, you have no choice. You can further minimize the cost by issuing the xor instruction earlier, interspersing it amongst other code.
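As a rough sketch of what issuing the xor earlier might look like, the fragment below zeroes EDX before some unrelated work. The instructions touching AX, SI, and CX are hypothetical filler, not part of the original problem:

```asm
xor  edx, edx             ; zero EDX early; note that xor overwrites the flags
mov  ax, WORD PTR [si]    ; unrelated work (hypothetical)
add  ax, cx               ; unrelated work (hypothetical)
mov  dx, WORD PTR [bx]    ; EDX now holds the zero-extended 16-bit value
```

The one thing to watch is that xor modifies the flags, so it has to be hoisted to a point where no later instruction still depends on flags set before it. Nothing between the xor and the final mov may write to EDX, either.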
One significant disadvantage of this approach is that you cannot do an in-place zero-extension on a value stored in a register. That is, there is no way to use this trick when you have code like:
movzx edx, dx
Instead, you would have to use a temporary register:
xor eax, eax
mov ax, dx
mov edx, eax ; optional, if you really needed the result to be in EDX
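If no scratch register is free, a different technique from the xor+mov trick described here is to mask off the upper half with and. This is a sketch of that alternative, not something from the discussion above. Like the code it replaces, it needs a 386 or later because it uses a 32-bit register:

```asm
and  edx, 0000FFFFh   ; clear bits 16-31 of EDX in place; DX is unchanged
```

The trade-offs are that and carries a 32-bit immediate, so the encoding is larger than a register-to-register instruction, and that it overwrites the flags.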
Or, if you were zero-extending an 8-bit value, you could take advantage of the fact that the upper and lower 8-bit halves of a 16-bit register can be accessed independently on x86, and simply zero the upper 8 bits. For example:
mov al, BYTE PTR [bx]
xor ah, ah
; now read from value in AX
Note that this works for in-place zero-extension: just zero the high 8 bits. However, this technique cannot be used to zero-extend a 16-bit value, since there is no way to access just the upper 16 bits of a 32-bit register.
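The same idea applies to any of the four registers that have separately addressable halves (AX, BX, CX, DX). As a minimal sketch, with BX and the value 0F0h chosen arbitrarily, an in-place zero-extension that works all the way back to the 8086 looks like this:

```asm
mov  bl, 0F0h    ; BL = F0h; BH still holds whatever was there before
xor  bh, bh      ; BX = 00F0h: BL has been zero-extended in place
```

If the flags must survive the operation, mov bh, 0 does the same job without touching them, at the cost of a slightly larger instruction.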
Fortunately, the need for zero-extension on these older architectures is much less than it is on modern architectures, since you don't have to guard as vigorously against partial-register stalls and false dependencies.
In comments, the concern was raised that all of the alternatives to movzx require more than one instruction. Of course, that is true. If there were a way to do it in a single instruction, there wouldn't have been a need for the 386 to introduce movzx. If you're worried about execution speed, consider what I said above: xor+mov will be at least as fast as movzx would have been if it were available.
If you're worried about the number of instructions, then rest assured that less code does not necessarily mean faster code. In fact, in many cases, adding instructions can make your program execute faster. If you are trying to optimize a particular chunk of code, I encourage you to ask a question about it either here or on Code Review (we need more assembly language questions there!).