The `BYTE` = 8 bits, `WORD` = 16 bits, and `DWORD` = 32 bits (double-word) terminology comes from Intel's instruction mnemonics and documentation for the 8086. It's just terminology, and at this point doesn't imply anything about the size of the "machine word" on the actual machine running the code.
My guess: those C type names were probably originally introduced for the same reason that C99 standardized `uint8_t`, `uint16_t`, and `uint32_t`. The idea was probably to allow C implementations with an incompatible ABI (e.g. 16 vs. 32-bit `int`) to still compile code that uses the WinAPI, because the ABI uses `DWORD` rather than `long` or `int` in `struct`s and in function args / return values.
Probably as Windows evolved, enough code came to depend in various ways on the exact definition of `WORD` and `DWORD` that MS decided to standardize the exact `typedef`s. This diverges from the C99 `uint16_t` idea, where you can't assume that it's `unsigned short`.
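As a concrete sketch (the typedefs shown match the ones Windows documents in `windef.h` for 32-bit Windows; the `static_assert` is C11, used here just to state the width guarantee):

```c
#include <stdint.h>
#include <assert.h>   /* C11 static_assert */

/* The documented WinAPI typedefs (32-bit Windows ABI): */
typedef unsigned char  BYTE;    /* always 8 bits  */
typedef unsigned short WORD;    /* always 16 bits */
typedef unsigned long  DWORD;   /* always 32 bits */

/* C99 guarantees the width, but not which underlying type is used: */
static_assert(sizeof(WORD) == sizeof(uint16_t), "same width");
/* uint16_t might be a different 16-bit type on some implementation,
 * while WORD is documented to be exactly `unsigned short`. */
```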
As @supercat points out, this can matter for aliasing rules. E.g. if you modify an array of `unsigned long[]` through a `DWORD*`, it's guaranteed to work as expected, but if you modify an array of `unsigned int[]` through a `DWORD*`, the compiler might assume that didn't affect array values it already had in registers. This also matters for `printf` format strings. (C99's `<stdint.h>` solution to that is preprocessor macros like `PRIu32`.)
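A sketch of both pitfalls (assuming 32-bit Windows, where `DWORD` is `unsigned long`; the function names are made up for illustration):

```c
#include <stdio.h>
#include <inttypes.h>           /* PRIu32 */

typedef unsigned long DWORD;    /* the 32-bit Windows definition */

/* Fine: DWORD *is* unsigned long, the same type, so no aliasing issue. */
void store_ok(unsigned long *arr) {
    *(DWORD *)arr = 0;          /* guaranteed visible through arr[0] */
}

/* Strict-aliasing trap: unsigned int is a distinct type from unsigned long
 * even when both are 32 bits, so the compiler may assume this store
 * can't modify an unsigned int object. */
unsigned int store_bad(unsigned int *arr) {
    unsigned int old = *arr;
    *(DWORD *)arr = 0;          /* undefined behaviour */
    return *arr;                /* may still return `old` from a register */
}

int main(void) {
    uint32_t x = 42;
    printf("%" PRIu32 "\n", x); /* portable; "%u" vs. "%lu" depends on the typedef */
    return 0;
}
```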
Or maybe the idea was just to use names that match the asm, to make sure nobody was confused about the width of types. In the very early days of Windows, writing programs directly in asm, instead of C, was popular, and WORD/DWORD makes the documentation clearer for people writing in asm.
Or maybe the idea was just to provide fixed-width types for portable code, e.g. `#ifdef SUNOS`: define it to an appropriate type for that platform.
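Something like this sketch (the platform macro and the type choices are hypothetical, just to illustrate the pre-C99 pattern):

```c
/* Hypothetical pre-C99 portability header: on each platform,
 * pick whichever native type happens to be 32 bits. */
#ifdef SUNOS
typedef unsigned long  DWORD;   /* assuming long is the 32-bit type there */
#else
typedef unsigned int   DWORD;   /* assuming int is 32 bits elsewhere */
#endif
```

This is all it's good for at this point, as you noticed: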
> How is it that the Windows API can `typedef unsigned short WORD;` and then say `WORD` is a 16-bit unsigned integer, when the C Standard itself does not guarantee that a `short` is always 16 bits?
You're correct: documenting the exact `typedef`s means that it's impossible to correctly implement the WinAPI headers on a system using a different ABI (e.g. one where `long` is 64bit or `short` is 32bit). This is part of the reason why the x86-64 Windows ABI makes `long` a 32bit type; the x86-64 System V ABI (Linux, OS X, etc.) makes `long` a 64bit type.
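You can see that divergence with a one-liner (a sketch; the comments state the documented data models, LLP64 for x86-64 Windows and LP64 for x86-64 System V):

```c
#include <stdio.h>

int main(void) {
    /* x86-64 Windows (LLP64): prints 4, so `typedef unsigned long DWORD`
     * keeps working.  x86-64 Linux/OS X (LP64): prints 8, so the same
     * typedef would make DWORD a 64-bit type and break the ABI. */
    printf("sizeof(long) = %zu\n", sizeof(long));
    return 0;
}
```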
Every platform does need a standard ABI, though. `struct` layout, and even interpretation of function args, requires all code to agree on the size of the types used (see the sketch below). Code from different versions of the same C compiler can interoperate, and so can code from other compilers that follow the same ABI. (However, C++ ABIs aren't stable enough to standardize. For example, `g++` has never standardized an ABI, and new versions do break ABI compatibility.)
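A sketch of why the sizes have to match: every later member's offset depends on the sizes (and alignment) of the members before it, so two compilers that disagree about `sizeof(long)` lay out the same `struct` differently:

```c
#include <stddef.h>
#include <stdio.h>

struct msg {
    unsigned long  id;      /* 4 bytes under LLP64, 8 under LP64...     */
    unsigned short flags;   /* ...so this lands at offset 4 or 8,       */
    void          *payload; /* and this member (plus padding) moves too */
};

int main(void) {
    printf("offsetof(flags)   = %zu\n", offsetof(struct msg, flags));
    printf("offsetof(payload) = %zu\n", offsetof(struct msg, payload));
    return 0;
}
```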
Remember that the C standard only tells you what you can assume across every conforming C implementation. The C standard also says that signed integers might be sign/magnitude, one's complement, or two's complement. Any specific platform will use whatever representation the hardware does, though.
Platforms are free to standardize anything that the base C standard leaves undefined or implementation-defined. E.g. x86 C implementations allow unaligned pointers to exist, and even allow dereferencing them. This happens a lot with `__m128i` vector types.
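E.g. with the real SSE2 intrinsics, where `_mm_loadu_si128` is the documented way to express an unaligned 16-byte load (a minimal sketch):

```c
#include <emmintrin.h>   /* SSE2 */

__m128i load_any(const int *p) {
    /* The cast can create an under-aligned __m128i*; x86 implementations
     * define this, and _mm_loadu_si128 dereferences it safely. */
    return _mm_loadu_si128((const __m128i *)p);
}
```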
The actual names chosen tie the WinAPI to its x86 heritage, and are unfortunately confusing to anyone not familiar with x86 asm, or at least Windows's 16bit DOS heritage.
The 8086 instruction mnemonics that include `w` for word and `d` for dword, like `cbw` (convert byte to word) and `cwd` (convert word to dword), were commonly used as setup for `idiv` signed division.
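Plain C signed division still compiles to that pattern today, just with the 386's dword version (`cdq`, covered below) instead of `cwd`. A sketch, with typical gcc/clang `-O2` x86-64 output in the comments:

```c
int quotient(int a, int b) {
    return a / b;
    /* Typical codegen:
     *   mov  eax, edi
     *   cdq              ; sign-extend eax (dword) into edx:eax
     *   idiv esi         ; edx:eax / esi -> quotient in eax
     */
}
```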
These insns still exist and do exactly the same thing in 32bit and 64bit mode. (386 and x86-64 added extended versions, as you can see in those extracts from Intel's insn set reference.) There are also the string instructions, like `lodsw` and `rep movsw`.
Besides those mnemonics, operand-size needs to be explicitly specified in some cases, e.g. `mov dword ptr [mem], -1`, where neither operand is a register that can imply the operand-size. (To see what assembly language looks like, just disassemble something, e.g. on a Linux system: `objdump -Mintel -d /bin/ls | less`.)
So the terminology is all over the place in x86 asm, which is something you need to be familiar with when developing an ABI.
More x86 asm background, history, and current naming schemes
Nothing below this point has anything to do with WinAPI or the original question, but I thought it was interesting.
See also the x86 tag wiki for links to Intel's official PDFs (and lots of other good stuff). This terminology is still ubiquitous in Intel and AMD documentation and instruction mnemonics, because it's completely unambiguous in a document for a specific architecture that uses it consistently.
386 extended register sizes to 32bits, and introduced the `cdq` instruction (`eax` (dword) -> `edx:eax` (qword)). (It also introduced `movsx` and `movzx`, to sign- or zero-extend without needing to get the data into `eax` first.) Anyway, quad-word is 64bits, and was used even pre-386 for `double`-precision memory operands: `fld qword ptr [mem]` / `fst qword ptr [mem]`.
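In C, `movsx` / `movzx` are what integer-widening casts usually compile to (a sketch; the comments show typical x86-64 codegen with the arg arriving in `di`):

```c
int sign_extend(short x)                   { return x; }  /* movsx eax, di */
unsigned int zero_extend(unsigned short x) { return x; }  /* movzx eax, di */
```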
Intel still uses this b/w/d/q/dq convention for vector instruction naming, so it's not at all something they're trying to phase out.
E.g. the `pshufd` insn mnemonic (`_mm_shuffle_epi32` C intrinsic) is Packed (integer) Shuffle Dword. `psraw` is Packed Shift Right Arithmetic Word. (FP vector insns use a `ps` (packed single) or `pd` (packed double) suffix instead of a `p` prefix.)
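A sketch of how those mnemonics map onto the C intrinsics (`epi32` = packed 32-bit integer elements, `epi16` = 16-bit):

```c
#include <emmintrin.h>   /* SSE2 */

__m128i demo(__m128i v) {
    /* pshufd: Packed Shuffle Dword -- reverse the four 32-bit elements. */
    __m128i r = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 1, 2, 3));
    /* psraw: Packed Shift Right Arithmetic Word -- on 16-bit elements. */
    return _mm_srai_epi16(r, 2);
}
```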
As vectors get wider and wider, the naming starts to get silly: e.g. `_mm_unpacklo_epi64` is the intrinsic for the `punpcklqdq` instruction: Packed-integer Unpack Low Quad-words to Double-Quad (i.e. interleave the 64bit low halves into one 128b vector). Or `movdqu` for Move Double-Quad Unaligned loads/stores (16 bytes). Some assemblers use `o` (oct-word) for declaring 16-byte integer constants, but Intel mnemonics and documentation always use `dq`.
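Decoding that name in code (a sketch):

```c
#include <emmintrin.h>

__m128i interleave_low_qwords(__m128i a, __m128i b) {
    /* punpcklqdq: take the low quad-word (64 bits) of each source
     * and pack them into one double-quad (128-bit) result. */
    return _mm_unpacklo_epi64(a, b);
}
```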
Fortunately for our sanity, the AVX 256b (32B) instructions still use the SSE mnemonics, so `vmovdqu ymm0, [rsi]` is a 32B load, but there's no quad-quad terminology. Disassemblers that include operand-sizes even when they're not ambiguous would print `vmovdqu ymm0, ymmword ptr [rsi]`.
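In intrinsics, the widening shows up as a suffix on the vector type rather than a new size name (a sketch):

```c
#include <immintrin.h>   /* AVX */

__m256i load32(const void *p) {
    /* Compiles to vmovdqu ymm, [mem]: a 32-byte unaligned load,
     * still spelled "dqu" even though it moves two double-quads. */
    return _mm256_loadu_si256((const __m256i *)p);
}
```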
Even the names of some AVX-512 extensions use the b/w/d/q terminology. AVX-512F (foundation) doesn't include all element-size versions of every instruction. The 8bit and 16bit element size versions of some instructions are only available on hardware that supports the AVX-512BW extension. There's also AVX-512DQ for extra dword and qword element-size instructions, including conversion between float/double and 64bit integers and a multiply with 64b x 64b => 64b element size.
A few new instructions use numeric sizes in the mnemonic
AVX's `vinsertf128` (and similar instructions for inserting/extracting the high 128bit lane of a 256bit vector) could have used `dq`, but instead use `128`.
AVX-512 introduces a few insn mnemonics with names like `vmovdqa64` (vector load with masking at 64bit element granularity) or `vshuff32x4` (shuffle of 128b elements, with masking at 32bit element granularity).
Note that since AVX-512 has merge-masking or zero-masking for almost all instructions, even instructions that didn't used to care about element size (like `pxor` / `_mm_xor_si128`) now come in different sizes: `_mm512_mask_xor_epi64` (`vpxorq`), where each mask bit affects a 64bit element, or `_mm512_mask_xor_epi32` (`vpxord`). The no-mask intrinsic `_mm512_xor_si512` could compile to either `vpxorq` or `vpxord`; it doesn't matter.
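A sketch of the masked form (requires AVX-512F; a `__mmask16` has one bit per 32-bit element of a 512-bit vector):

```c
#include <immintrin.h>

__m512i masked_xor32(__m512i src, __mmask16 k, __m512i a, __m512i b) {
    /* vpxord with merge-masking: where a mask bit is 0, the 32-bit
     * element is taken from src; where it's 1, it gets a[i] ^ b[i]. */
    return _mm512_mask_xor_epi32(src, k, a, b);
}
```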
Most AVX-512 new instructions still use b/w/d/q in their mnemonics, though, like `VPERMT2D` (full permute selecting dword elements from two source vectors).