How is a tword ten bytes instead of 20?

Question

So I was trying to find out the sizes of zword and yword when I stumbled upon this helpful thread, which had a list of all the sizes, but one thing confused me. It says in the edit that a tword is 10 bytes, but that doesn't really make sense to me. A word is two bytes, so a tword should be 10 words, AKA 20 bytes, no? I found it even weirder that an oword is indeed 8 words / 16 bytes, so an oword is actually larger than a tword. Is there an explanation for this odd choice of naming?

In MASM and TASM it's called `TBYTE` (i.e. **T**en Bytes). Why the NASM devs picked the name `TWORD` I don't know. Perhaps because all the other types except `BYTE` are called `*WORD`. I guess `QWORD` (for **Q**uint Word) would've been more accurate, but `QWORD` was already taken. — Michael, Feb 05 '21 at 09:37
I haven't used MASM or TASM, so that was interesting to know. Your reasoning does seem to somewhat make sense though. — mediocrevegetable1, Feb 05 '21 at 09:42
@Michael `T` stands for *temporary*, not *ten*. After all, the instruction `fstp tword` is glossed “floating point store temporary and pop.” — fuz, Feb 05 '21 at 12:05

Peter Cordes · Accepted Answer · 2021-02-06T02:49:44.973

In NASM, all the things larger than 1 byte are `*WORD`, e.g. `yword` for the size of a YMM vector. (@Michael noted this in comments; sounds like an intentional pattern to me.)

This leads to the silly name `TWORD`: a ten-byte thing made of words? Don't think too hard about breaking the name down into its parts, but if you can't help it, that reading (ten bytes, with the usual `WORD` suffix) is probably your best bet.

`TBYTE`, used by MASM / TASM (and by GAS `.intel_syntax`, according to `objdump -drwC -Mintel`), makes more sense as a name. NASM certainly wanted a T in the name for some consistency with existing assemblers.

Alternatively, @fuz suggests that T stands for "temporary", as the primary intended use-case for x87's extra precision vs. IEEE binary64 double is as temporaries during computations. Usually (for that use case) they can just live in registers, but sometimes you might want to spill them. And of course you can use them as long double extended precision with (significantly) worse load/store performance than double, or have constants in that format. See also Bruce Dawson's Intermediate Floating-Point Precision article for more real-world x87 and compiler stuff, part of an excellent series.
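To make those 10 bytes concrete, here is a minimal sketch of how the x87 extended-precision layout breaks down: a 64-bit significand with an explicit integer bit, followed by a 15-bit exponent (bias 16383) and a sign bit. Python is used only as neutral notation here, and `decode_x87_extended` is a made-up helper name, not anything from NASM or another tool. It handles only zero and normal numbers and rounds the result to a Python `float` (binary64), so the extra precision is lost on the way out.

```python
import math

def decode_x87_extended(raw: bytes) -> float:
    """Decode a 10-byte little-endian x87 extended value (zero and normals only)."""
    if len(raw) != 10:
        raise ValueError("a tword / TBYTE is exactly 10 bytes")
    # Bytes 0..7: 64-bit significand, explicit integer bit at bit 63.
    significand = int.from_bytes(raw[:8], "little")
    # Bytes 8..9: sign bit (bit 15) plus 15-bit biased exponent.
    sign_exp = int.from_bytes(raw[8:], "little")
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF
    if exponent == 0x7FFF:
        raise ValueError("infinity / NaN are not handled in this sketch")
    if exponent == 0:
        if significand == 0:
            return sign * 0.0
        raise ValueError("denormals are not handled in this sketch")
    # value = (significand / 2**63) * 2**(exponent - 16383)
    return sign * math.ldexp(significand, exponent - 16383 - 63)

# 1.0: significand 0x8000000000000000, exponent 0x3FFF, sign 0
one = bytes.fromhex("0000000000000080ff3f")
print(len(one), decode_x87_extended(one))
```

The point for the naming question is only the length check: the value occupies 10 bytes (80 bits), which is five 16-bit words, not ten.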

NASM has directives like `resb` / `resd` / `rest` / ... that reserve space (for use in the BSS), and `db` / `dd` / `dt` / ... that emit initialized data. They only have 1 character for the size, and other than Byte and Word it's basically a size code. And unlike MASM, there's no `DWORD` directive you can use as an alternative to `DD`, so the "size code" is relatively more important, and the thing it's stuck onto is more regularized.
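The "size code" idea can be sketched as a lookup table: one letter picks the size, and the same letter shows up in the data directive, the reserve directive, and (apart from `byte`) the `*word` keyword. This is an illustration in Python rather than NASM source, and `SIZE_CODES`, `names_for` and `size_of` are hypothetical names invented for this sketch.

```python
# One-letter size codes and their sizes in bytes, as NASM uses them.
SIZE_CODES = {
    "b": 1,   # byte
    "w": 2,   # word
    "d": 4,   # dword
    "q": 8,   # qword
    "t": 10,  # tword (x87 80-bit extended precision)
    "o": 16,  # oword (XMM-sized)
    "y": 32,  # yword (YMM-sized)
    "z": 64,  # zword (ZMM-sized)
}

def names_for(code):
    """Return (size keyword, data directive, reserve directive) for a size code."""
    if code == "b":
        keyword = "byte"
    elif code == "w":
        keyword = "word"
    else:
        keyword = code + "word"
    return keyword, "d" + code, "res" + code

def size_of(keyword):
    """Byte size of a keyword such as 'tword'; the first letter is the size code."""
    kw = keyword.lower()
    code = kw[:1]
    if code not in SIZE_CODES or names_for(code)[0] != kw:
        raise ValueError("unknown size keyword: " + keyword)
    return SIZE_CODES[code]

for code, size in SIZE_CODES.items():
    print(names_for(code), size)
```

Read this way, the `t` in `tword` / `dt` / `rest` is just the code for "10 bytes", and the `word` part is the suffix every size above `byte` gets, not a multiplier.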

Of course NASM does already have to parse BYTE and WORD as operand-size codes, so it's hard to imagine TBYTE would have actually made the parser measurably harder to write or maintain. (As one answer on the Q&A linked in the question shows, the disassembler or something in NASM has a switch where each string is fully separate, not %cWORD for sizes > 2, but parsing could still be different.)

This seems like a plausible theory for why NASM's designer(s) wanted to stick to the `*WORD` pattern, but I have no information on it and didn't go looking for any mailing list docs. IDK if NASM was designed collaboratively in public, or if it was pretty much just the original author. Either way I'm guessing it seemed like a good idea to someone in the early days of designing the syntax.

FASM apparently supports both TBYTE and TWORD names for the same size, but NASM only supports TWORD. Adding a new reserved / special keyword to the language could perhaps have broken backwards compat with code that inadvisably used TBYTE as a symbol name, or maybe NASM developers just never even wanted to change.

I checked the manual and it says NASM uses `tword` "for historical reasons". I'm not sure what those historical reasons are, but your theory does seem very plausible. — mediocrevegetable1, Feb 05 '21 at 10:00
@mediocrevegetable1: that would be consistent with "seemed like a good idea at the time", and then decided not to break backwards compat by changing the language's keywords once someone made the same point you did. (Even introducing a new keyword and keeping TWORD as an alternative could in theory break code that happened to use TBYTE as a label name or something.) — Peter Cordes, Feb 05 '21 at 10:03
In FASM too, `TBYTE` and `TWORD` are synonymous for **10** bytes. Additionally there's `FWORD` and `PWORD` that are synonymous for **6** bytes. — Sep Roland, Feb 05 '21 at 23:42
According to [fuz's comment on another answer](https://stackoverflow.com/questions/52733927/asm-operand-type-mismatch-for-cmp#comment92423974_52737154), *`fstpt` is explicitly called “floating point store temporary” in the 8087 datasheet* — Peter Cordes, Apr 05 '21 at 21:37
