36

I'm currently learning assembly programming on Intel x86 processors.

Could someone please explain the difference between the MMX and XMM registers? I'm confused about what purposes they serve, and about the differences and similarities between them.

Peter Cordes
Thor
  • 3
    The [mmx tag wiki](https://stackoverflow.com/tags/mmx/info) has a bit of stuff. The [SSE tag wiki](https://stackoverflow.com/tags/sse/info) has a History section that explains about XMM registers being new architectural state, unlike the MMX regs. It mentions SSE1 through SSE4.x, because the `[sse]` tag is kind of a catch-all for different versions of SSE. There are also some good links to programming guides / slides / manuals about learning SIMD. – Peter Cordes Oct 23 '17 at 22:06

1 Answer

44

MM registers are the registers used by the MMX instruction set, one of the first attempts to add (integer-only) SIMD to x86. They are 64 bits wide and are actually aliases for the mantissa parts of the x87 registers (although they are not affected by the FPU top-of-stack position). This was done to keep compatibility with existing operating systems (which already saved the FPU stack on context switch), but it made using MMX together with floating point a nontrivial job: since the two share state, MMX code has to execute `emms` before any x87 code runs again.

Nowadays they are just a historical oddity; I don't think anybody actually uses MMX anymore, as it has been completely superseded by the various SSE extensions. Edit: as Peter Cordes points out in the comments, there is still quite a lot of MMX code around.


XMM registers, instead, are a completely separate register set, introduced with SSE and still widely used to this day. They are 128 bits wide, with instructions that can treat them as arrays of 64-bit or 32-bit (integer and floating point) values, or of 16-bit or 8-bit (integer only) values. You have 8 of them in 32-bit mode and 16 in 64-bit mode. Virtually all floating point math is done in SSE (and thus in XMM registers) in 64-bit mode, so, unlike the MMX registers, they are still quite relevant.
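To make the "arrays of values" idea concrete, here is a minimal sketch that models one 128-bit register as 16 raw bytes and reinterprets them with Python's `struct` module. It is only a model, not real SIMD code: the helper names `lanes` and `addps_model` are made up for illustration, and little-endian byte order is assumed because that is what x86 uses.

```python
import struct

def lanes(reg: bytes, fmt: str) -> tuple:
    """Reinterpret the 16 bytes of a modelled XMM register using a struct format."""
    if len(reg) != 16 or struct.calcsize(fmt) != 16:
        raise ValueError("an XMM register holds exactly 128 bits (16 bytes)")
    return struct.unpack(fmt, reg)

def addps_model(a: bytes, b: bytes) -> bytes:
    """Lane-wise add of four 32-bit floats, in the spirit of the SSE addps instruction."""
    sums = [x + y for x, y in zip(lanes(a, "<4f"), lanes(b, "<4f"))]
    return struct.pack("<4f", *sums)

# Two modelled registers, each holding four packed single-precision floats.
xmm0 = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
xmm1 = struct.pack("<4f", 10.0, 20.0, 30.0, 40.0)

# The same 128 bits, viewed with different lane layouts.
as_f32 = lanes(xmm0, "<4f")    # 4 x 32-bit float: (1.0, 2.0, 3.0, 4.0)
as_u32 = lanes(xmm0, "<4I")    # 4 x 32-bit integer: the raw float bit patterns
as_f64 = lanes(xmm0, "<2d")    # 2 x 64-bit float: same bits, different meaning
as_u8 = lanes(xmm0, "<16B")    # 16 x 8-bit integer

# One "instruction" operates on all four lanes at once.
packed_sum = lanes(addps_model(xmm0, xmm1), "<4f")   # (11.0, 22.0, 33.0, 44.0)
```

The register itself has no type: it is the instruction (here, the format string) that decides how the bits are split into lanes.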

Nowadays you may also encounter the YMM and ZMM registers; they were introduced with the AVX (2011) and AVX-512 (2015) instruction sets respectively, and they extend the XMM registers, not unlike the e and r extensions to the general-purpose registers (rax extends eax, which extends ax, which can be accessed as ah:al).
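For readers who have not met the general-purpose register analogy yet, it can be sketched with plain bit masks. This is only an illustration in Python (the value loaded into `rax` is arbitrary); the narrower names are simply views of the low bits of the same register.

```python
# Model the 64-bit register rax as a Python integer.
rax = 0x1122334455667788

eax = rax & 0xFFFF_FFFF   # low 32 bits
ax = rax & 0xFFFF         # low 16 bits
ah = (rax >> 8) & 0xFF    # bits 8..15, the high byte of ax
al = rax & 0xFF           # bits 0..7, the low byte of ax

print(hex(eax), hex(ax), hex(ah), hex(al))
# prints: 0x55667788 0x7788 0x77 0x88
```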

In an AVX-capable processor, each XMM register is expanded to 256 bits. The whole 256-bit register is referred to as YMMx (x from 0 to 15) and can be used by the new AVX instructions; the lower half is XMMx, and can still be used by older SSE instructions.

Similarly, AVX-512 expands the registers above to 512 bits: the whole register is ZMMx (usable with the AVX-512 instructions), the lower 256 bits are YMMx (also usable with AVX instructions), and the lower 128 bits are still XMMx (also usable with SSE). In addition, the register count is increased to 32 in 64-bit mode, so these registers are both bigger and twice as numerous.
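The nesting of the three names can be sketched as a toy model in Python. The class `VectorRegister` is hypothetical and only models reads: one 64-byte buffer stands for a single register, and the `xmm`, `ymm` and `zmm` properties are views of its low 16, low 32 and all 64 bytes.

```python
class VectorRegister:
    """Toy model of one AVX-512 register and its narrower aliases (reads only)."""

    def __init__(self) -> None:
        self.bits = bytearray(64)          # 512 bits, byte 0 is the least significant

    @property
    def zmm(self) -> bytes:
        return bytes(self.bits)            # the whole 512-bit register

    @property
    def ymm(self) -> bytes:
        return bytes(self.bits[:32])       # low 256 bits

    @property
    def xmm(self) -> bytes:
        return bytes(self.bits[:16])       # low 128 bits


reg0 = VectorRegister()
reg0.bits[:] = bytes(range(64))            # fill the register: byte i holds the value i

# The three names refer to the same storage, so the views always agree.
print(len(reg0.xmm), len(reg0.ymm), len(reg0.zmm))   # prints: 16 32 64
print(reg0.ymm[:16] == reg0.xmm)                     # prints: True
```

So "xmm0", "ymm0" and "zmm0" are not three registers but three widths of one register, and which width is used depends on the instruction.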

Matteo Italia
  • Hi Matteo, thank you very much for the detailed answer! It seems to me that you are very knowledgeable when it comes to assembly language, and I'm just a beginner. Do you mind if I ask what learning path you took to get to where you are right now? Thanks again for your help! – Thor Jun 02 '17 at 05:52
  • 2
    Well, my learning of assembly was mostly incidental... what I know of x86 assembly comes mostly from profiling, post-mortem debugging and reverse engineering of C/C++ applications, plus some fun with [code golf challenges](https://codegolf.stackexchange.com/users/9298/matteo-italia), which I usually solve in 16-bit x86 assembly. For a long time I never really explicitly *studied* it; it just crossed my path so many times that I ended up knowing something about it. Then I "hardened" my knowledge of it a bit by reading a decent chunk of the Intel IA32/EM64T manuals. – Matteo Italia Jun 02 '17 at 17:31
  • 2
    Another reason nobody uses MMX anymore is because x86-64 doesn't support it at all. – Cody Gray - on strike Jun 04 '17 at 11:10
  • 5
    x264 and ffmpeg still have lots of MMX code, some of which is used even on CPUs with SSE2 and AVX2. It would probably be a minor speedup to rewrite some of the functions to use the low half of an XMM for 8B stuff. @CodyGray: that's total nonsense. x86-64 (the ISA) definitely includes x87/MMX registers. All modern x86-64 OSes save/restore the x87 state (and thus the MMX state). Microsoft has said some stuff about MMX being deprecated / discouraged for x86-64, but even they are clear that it does actually work at least at the asm level, whether or not their compiler intrinsics still work. – Peter Cordes Oct 23 '17 at 22:00
  • Terminology: "register file size" would normally mean the number of *physical* registers, not the number of *architectural* registers. I'm not sure if there is a standard term that would fit your context. Maybe "register set size" or "register count"? (Also, 32-bit code can still only use zmm0-zmm7. 32-bit code is so obsolete for high-performance that it's maybe not worth complicating the answer with that wrinkle, though.) – Peter Cordes Oct 23 '17 at 22:11
  • @PeterCordes: yep, register count sounds way better. For the 32 bit part, I didn't even know that you could use AVX-512 from 32 bit code, go figure :-D – Matteo Italia Oct 23 '17 at 22:41
  • @Peter If I remember correctly, MASM64 issues an error when you try to use MMX instructions, and MSVC certainly complains about it. I don't know if that's changed in recent versions. Sure, the chip supports it, but that's rather different than having OS-level support. Last I looked at this, there were very clear statements from Microsoft that 64-bit versions of Windows do *not* support MMX and you essentially continue to use it at your own risk. Maybe that's all FUD, but I don't know what their motivation would be. – Cody Gray - on strike Oct 24 '17 at 12:42
  • 2
    @CodyGray: AFAICT, it's all FUD. https://chessprogramming.wikispaces.com/MMX?responseToken=e9501a67e04a4f7c3efc2f4cde3d715e#MMX%20and%2064-bit%20Windows. MS's toolchain might refuse to assemble MMX or x87 instructions, but OS-level context-switch support still exists. The GNU toolchain (and NASM/YASM) work as normal for MMX on Win64, AFAIK. However, kernel code can't use MMX, only SSE/AVX: https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/floating-point-support-for-64-bit-drivers. – Peter Cordes Oct 24 '17 at 23:40
  • 1) Which is faster, FPU or XMM in terms of arriving at the result? 2) Why does MASM32 teach FPU if a better alternative is available? 3) Is it even possible to display a 128/256-bit value? I'm asking that because no single datatype in C++ is that wide. 4) Are there any types in MASM that are that wide? I believe the two widest that I know of are the TBYTE and REAL10. 5) If not, how can C++ guarantee a 96-bit or 12-byte type (like long double is on some systems) if the widest type is only 80 bits? Also, how can .NET expose the 96-bit System.Decimal type, which works perfectly? – Thomas May 05 '20 at 21:41
  • 1
    1) Generally SSE 2) No idea, never used MASM32; however x87 is the "traditional" way to go in 32 bit code. 3) "It's complicated". Those "fat" registers actually [hold an array of values of regular sizes](https://www.codingame.com/playgrounds/283/sse-avx-vectorization/what-is-sse-and-avx); the only type bigger than regular ones is 128 bit integer. SSE-supporting compilers generally provide the relevant extra types that represent an SSE register. 4) `XMMWORD`/`YMMWORD` 5 & 6) I don't see the problem... types exposed by a HLL are not limited by the hardware. – Matteo Italia May 06 '20 at 08:17
  • does registers zmm16 - zmm31 support xmm16 - xmm31 and ymm16 - ymm31? – Sourav Kannantha B Jul 11 '21 at 13:55
  • @SouravKannanthaB I'm not sure I understand your question... zmm registers are an expansion of the ymm registers, which are an expansion of the xmm ones. On a machine supporting zmm, you can still use the older instructions that refer to the low part of such registers as xmm. – Matteo Italia Jul 11 '21 at 23:54
  • This answer does not make it clear whether it is possible to use an XMM register as a single 128-bit floating-point value. Or is 2x64 the widest format one can get out of an XMM register? – AnT stands with Russia Apr 16 '22 at 16:08
  • 1
    2x64 is as wide as you can go — I thought it was clear from the sentence «They are 128 bit wide, with instructions that can treat them as arrays of 64, 32 (integer and floating point), 16 or 8 bit (integer only) values» – Matteo Italia Apr 16 '22 at 17:13